INDEX
Explanations
references to animal care and adoption
words and phrases that appear in recipe or cooking instructions, particularly ingredient lists and measurement units.
Negative Logits
Lingkungan    -0.42
prakty        -0.41
arbete        -0.41
wobec         -0.40
arbeta        -0.40
strutture     -0.40
aktivitet     -0.40
moviliz       -0.40
indywidual    -0.39
güçlü         -0.39
Positive Logits
fucking       0.96
hipster       0.90
fuckin        0.88
goddamn       0.84
fucking       0.84
shitty        0.82
assholes      0.82
fuck          0.80
asshole       0.79
hilar         0.79
Activations Density: 1.707%