NEGATIVE LOGITS
Token         Logit
gender        -0.07
>$            -0.06
foes          -0.06
defamation    -0.06
遭            -0.06
older         -0.05
ocations      -0.05
Cyril         -0.05
ação          -0.05
ьв            -0.05
POSITIVE LOGITS
Token         Logit
Institute     0.13
institute     0.12
institutes    0.09
Instituto     0.08
Viện          0.08
_Report       0.07
programme     0.07
(in           0.07
plank         0.07
resolve       0.07
Activations Density 0.009%