Explanations
references to "people" and their connected rights or issues
Negative Logits
ìľ¨     -0.17
uber    -0.17
inia    -0.16
olia    -0.14
asca    -0.14
aber    -0.14
reck    -0.14
ault    -0.14
fers    -0.14
Ding    -0.14
Positive Logits
zik     0.16
827     0.15
اÙĨÚ¯   0.15
üstü    0.15
adult   0.15
ãģ¡ãģ¯  0.14
/org    0.14
brace   0.13
Ric     0.13
disarm  0.13
Activations Density: 0.046%