INDEX
Explanations
themes related to struggle and advocacy for social justice
Negative Logits
ια            -0.16
sex           -0.15
же            -0.14
aight         -0.14
alls          -0.14
ales          -0.14
opher         -0.14
imes          -0.13
/read         -0.13
zego          -0.13
Positive Logits
iram          0.18
InProgress    0.16
erto          0.15
ÅĻÃŃ          0.15
\Tests        0.15
537           0.14
Bernstein     0.14
ucu           0.14
hetic         0.14
εÏģÏĮ          0.13
Activations Density 0.080%