Explanations
references to political ideology or affiliation, specifically focusing on the political "Right."
references to right-wing political ideologies or groups
Negative Logits
ĸļ            -0.80
ulative       -0.78
Mehran        -0.75
srfAttach     -0.71
cit           -0.69
Constant      -0.63
aden          -0.62
Santos        -0.61
minecraft     -0.61
Unloaded      -0.61
Positive Logits
wing          1.11
eous          1.01
ward          1.00
wing          0.91
move          0.82
lander        0.81
winger        0.78
tarian        0.74
shore         0.73
heirs         0.73
Activations Density 0.041%