Explanations
words related to political positions or stances on various issues
references to political positions or stances
Negative Logits
Token    Logit
ennes    -0.76
Champ    -0.72
Chel     -0.69
à¨       -0.67
agar     -0.64
INESS    -0.64
aniel    -0.63
@@@@     -0.62
flies    -0.62
anova    -0.61
Positive Logits
Token       Logit
behalf      1.59
erous       0.99
matters     0.90
yx          0.89
occasion    0.87
steroids    0.85
shore       0.83
Capitol     0.81
eworld      0.81
eness       0.80
Activations Density: 0.203%
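
For readers unfamiliar with the density figure, here is a minimal sketch of one common way such a number is computed: the fraction of sampled tokens on which the feature's activation is above zero. The source does not say how the 0.203% was obtained, so the function name `activation_density`, the zero threshold, and the toy data are all assumptions made for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Hypothetical helper: the dashboard's actual definition is not given
    in the source, so a strict `> threshold` rule is assumed here.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the source): the feature fires on 2 of 1000 tokens.
sample = [0.0] * 998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.203% would mean the feature is active on roughly 2 tokens in every 1000 sampled.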