Explanations
mentions of political figures and election-related words
topics related to political commentary and analysis
Negative Logits
Token       Logit
hens        -0.80
Variant     -0.80
cause       -0.76
Modified    -0.73
ocation     -0.68
Hide        -0.66
IPM         -0.65
Join        -0.65
isable      -0.65
reads       -0.62
Positive Logits
Token        Logit
already      1.05
admittedly   0.97
rarely       0.93
hardly       0.90
supposedly   0.86
notoriously  0.81
surely       0.80
arguably     0.79
previously   0.78
practically  0.78
Activations Density 0.756%