Explanations
phrases related to political discontent and social issues
Negative Logits
performan    -0.24
<↵           -0.19
CORPOR       -0.16
(↵           -0.16
ustum        -0.16
ho           -0.15
urrenc       -0.15
=(↵          -0.15
urette       -0.15
_ma          -0.15
Positive Logits
cor          0.18
sim          0.16
iesz         0.16
sud          0.15
aman         0.15
prom         0.15
enos         0.15
ÏĢί          0.14
nof          0.14
ief          0.14
Activations Density 0.029%