INDEX
Explanations
phrases related to political events and societal issues, particularly focusing on expressions of disenfranchisement
expressions of feelings associated with disenfranchisement and social discomfort
Head Attr Weights
0:0.04
1:0.02
2:0.05
3:0.06
4:0.04
5:0.08
6:0.03
7:0.06
8:0.32
9:0.06
10:0.12
11:0.06
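The listing above can be read as one weight per attention head. As a minimal sketch, assuming each line is a `head index:weight` pair and that the weights are rounded shares of the feature's attribution (an interpretation not stated in the source), the following ranks the heads and sums the weights. The helper name `top_heads` is hypothetical.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.04, 1: 0.02, 2: 0.05, 3: 0.06, 4: 0.04, 5: 0.08,
    6: 0.03, 7: 0.06, 8: 0.32, 9: 0.06, 10: 0.12, 11: 0.06,
}


def top_heads(weights, k=2):
    """Return the k (head, weight) pairs with the largest weights."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Sum of the listed weights; rounding in the source means it need not be 1.0.
total = round(sum(head_weights.values()), 2)

print(top_heads(head_weights))
print(total)
```

Under these assumptions, head 8 carries the largest listed weight (0.32), followed by head 10 (0.12).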
Negative Logits
endorsements: -1.16
atern: -1.14
surrog: -1.10
litter: -1.10
Yel: -1.06
raids: -1.05
counties: -1.04
alt: -1.03
conglomer: -1.03
gangs: -1.02
Positive Logits
raq: 1.33
uko: 1.29
GoldMagikarp: 1.27
QUI: 1.23
��: 1.23
happiest: 1.23
ergic: 1.17
lime: 1.16
nah: 1.15
amiliar: 1.15
Activations Density 0.075%