INDEX
Explanations
references to individuals or organizations involved in social issues or activism
Negative Logits
idia          -0.15
ibold         -0.15
الیا          -0.14
iras          -0.14
ANC           -0.14
plash         -0.14
ecc           -0.14
ransition     -0.14
oga           -0.13
Pett          -0.13
Positive Logits
Hanging       0.17
rein          0.17
guests        0.16
segment       0.15
ushi          0.15
upy           0.15
avn           0.15
resh          0.14
Guests        0.14
estring       0.14
Activations Density: 0.034%