INDEX
Explanations
indicators of political criticism or disapproval
Head Attr Weights
0:0.09
1:0.06
2:0.07
3:0.08
4:0.09
5:0.09
6:0.08
7:0.08
8:0.09
9:0.07
10:0.06
11:0.09
Negative Logits
Corinth     -1.82
AMA         -1.74
Savannah    -1.69
bri         -1.67
Paula       -1.59
Lump        -1.56
Fortune     -1.55
ALE         -1.50
��          -1.50
overtime    -1.50
Positive Logits
enaries     1.90
ilts        1.85
rogens      1.77
ctors       1.74
reed        1.71
Wiki        1.70
erion       1.70
YD          1.62
External    1.62
sei         1.61
Activations Density 0.000%