INDEX
Explanations
mentions of specific names, particularly related to political figures
references to Donald Trump and related political activity
Negative Logits
rica     -0.74
eret     -0.74
hap      -0.69
ilver    -0.67
andre    -0.65
laus     -0.64
ruary    -0.63
arch     -0.62
arts     -0.61
lag      -0.61
Positive Logits
Recomm          0.73
Construct       0.60
holiest         0.59
explodes        0.59
Senate          0.57
tariffs         0.57
reacts          0.57
interference    0.56
quitting        0.54
indu            0.54
Activations Density 0.036%
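
For readers unfamiliar with the density figure, the following is a minimal sketch of one way such a number could be computed. It assumes density means the fraction of token positions where the feature's activation is above zero; the dashboard does not state its exact definition, so this reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires.
    The dashboard's exact definition is not given in the source.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in values if a > threshold)
    return fired / len(values)


# Hypothetical sample: 100,000 token positions, 36 of which fire.
sample = [0.0] * 99_964 + [1.2] * 36
print(f"{activation_density(sample):.3%}")  # prints 0.036%
```

Under that assumption, a density of 0.036% corresponds to the feature firing on roughly 36 out of every 100,000 tokens.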