INDEX
Explanations
mentions of politicians
terms related to political discussions and commentary
Negative Logits
uran      -0.72
Warrant   -0.71
LEASE     -0.70
upon      -0.70
RED       -0.69
pity      -0.68
urous     -0.68
VIEW      -0.68
unction   -0.67
orney     -0.66
Positive Logits
icians        1.07
ifact         1.03
Polit         1.01
ician         0.95
icial         0.94
ically        0.88
correctness   0.85
eness         0.84
Pengu         0.77
paran         0.76
Activations Density 0.009%
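
The page does not say how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 9 of 100,000 tokens.
sample = [1.5] * 9 + [0.0] * 99_991
print(f"{activation_density(sample):.3f}%")
```

Under that assumption, a density of 0.009% would correspond to roughly 9 active tokens per 100,000 sampled, which fits a sparse feature tied to a narrow topic such as mentions of politicians.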