INDEX
Explanations
references to individuals or groups involved in social or community contexts
Negative Logits
Token    Logit
iverz    -0.17
itself   -0.16
swire    -0.15
ezi      -0.15
agli     -0.14
ameda    -0.14
будь     -0.14
šet      -0.14
someone  -0.13
ły       -0.13
Positive Logits
Token       Logit
who         0.45
'           0.39
who         0.38
whom        0.36
’           0.35
الذين       0.30
themselves  0.30
hips        0.29
/operators  0.28
kteří       0.27
Activations Density 0.878%