Explanations
references to political figures, especially Donald Trump and associated events
references to Donald Trump and his associates in political contexts
Negative Logits
mination      -0.75
REL           -0.69
ONY           -0.67
âĶĢ           -0.65
Contents      -0.64
ridor         -0.64
ante          -0.63
needle        -0.63
diagnostic    -0.63
ahi           -0.63
Positive Logits
pard          0.81
Pence         0.78
whis          0.78
Cheney        0.77
Putin         0.74
veto          0.70
ï             0.68
Putin         0.67
retweet       0.67
impuls        0.66
Activations Density: 0.379%