Explanations
terms related to human rights organizations and activities
Negative Logits
ckt      -0.15
inval    -0.15
acro     -0.14
BA       -0.13
apol     -0.13
ÃŃr      -0.13
ewis     -0.13
ektor    -0.13
anus     -0.13
yro      -0.13
Positive Logits
oli      0.16
reds     0.16
paque    0.15
czy      0.15
vil      0.14
niej     0.14
843      0.14
abled    0.14
ient     0.14
att      0.14
Activations Density: 0.012%