Explanations
terms related to activism and organizers involved in social causes
Negative Logits
Token     Logit
alist     -0.16
aison     -0.16
/store    -0.15
ousse     -0.15
лÑİ       -0.14
Guild     -0.14
alian     -0.14
hin       -0.14
elsen     -0.14
γά        -0.14
Positive Logits
Token     Logit
yth       0.16
ropy      0.16
ropa      0.16
Levin     0.15
¶ļ        0.14
against   0.14
olute     0.14
ithmetic  0.14
ÏĦία      0.13
apol      0.13
Activations Density: 0.013%