INDEX
Explanations
words associated with various forms of disenfranchisement and discontent
Negative Logits
Token      Logit
šem        -0.15
elson      -0.15
ทอง        -0.14
zet        -0.14
sto        -0.14
хи         -0.14
富         -0.14
ceptors    -0.14
abra       -0.14
kinson     -0.14
Positive Logits
Token      Logit
/dis       0.21
(dis       0.20
dis        0.20
Dis        0.20
Dis        0.18
-dis       0.18
.dis       0.17
zung       0.17
enance     0.16
DIS        0.16
Activations Density 0.038%