Explanations
expressions of emotional distress and assertiveness in confrontational contexts
Negative Logits
ouv      -0.16
hạng     -0.15
anship   -0.15
Ih       -0.15
mund     -0.14
OUCH     -0.14
перек    -0.14
eming    -0.14
mine     -0.14
idor     -0.14
Positive Logits
man      0.68
bro      0.45
baby     0.41
Man      0.38
MAN      0.37
-man     0.37
man      0.34
_man     0.34
brother  0.33
mate     0.32
Activations Density 0.349%