INDEX
Explanations
the presence of specific organizational names and titles
Negative Logits
ongs      -0.15
xon       -0.15
yg        -0.15
ple       -0.15
lok       -0.14
uts       -0.14
ito       -0.14
ÑĢаÑĩ      -0.14
nak       -0.14
igg       -0.13
Positive Logits
incom     0.17
undef     0.17
¿         0.17
odore     0.17
folks     0.16
äºŃ       0.16
nackte    0.16
ehir      0.16
iosper    0.16
firm      0.15
Activations Density: 0.140%