Explanations
reporting on political actions and statements
Negative Logits
olars      -0.77
ept        -0.73
aic        -0.72
zens       -0.70
byss       -0.69
iatures    -0.68
"}],"      -0.67
ateral     -0.67
haar       -0.67
pite       -0.66
Positive Logits
himself          1.00
aloud            0.93
Mexicans         0.87
openly           0.86
famously         0.82
homosexuals      0.81
insulting        0.80
onstage          0.80
gays             0.80
controversial    0.79
Activations Density 0.232%