Explanations
phrases related to social justice and economic critiques
Negative Logits
ysz     -0.18
Conc    -0.15
iyas    -0.15
conc    -0.15
yna     -0.14
ï¼ŀ     -0.14
iya     -0.14
æĽ°     -0.14
λÎŃ     -0.14
ıt      -0.14
Positive Logits
eton            0.16
Roe             0.16
thalm           0.15
ocre            0.14
tim             0.14
aron            0.14
umption         0.14
double          0.14
088             0.13
Publications    0.13
Activations Density 0.659%