INDEX
Explanations
mentions of political party affiliations, particularly Republican and Democratic
Negative Logits
á»įc          -0.08
Stateless    -0.07
hi           -0.07
alam         -0.07
hee          -0.07
_:*          -0.07
pository     -0.07
155          -0.07
semblies     -0.06
ظر           -0.06
Positive Logits
-dominated   0.08
-leaning     0.07
ajas         0.06
-led         0.06
-controlled  0.06
-run         0.06
ex           0.06
ipel         0.06
-friendly    0.06
ERY          0.06
Activations Density: 0.010%