INDEX
Explanations
mentions of political figures or topics
references to political themes or entities
Negative Logits
LEASE        -0.74
ENDED        -0.70
VIEW         -0.70
lighting     -0.69
noon         -0.67
Carbuncle    -0.66
upon         -0.65
ounty        -0.65
RIS          -0.65
pity         -0.65
Positive Logits
Polit          1.30
ifact          1.08
icians         1.06
ician          0.94
icial          0.88
ically         0.87
paran          0.85
correctness    0.83
Polit          0.80
tabl           0.78
Activations Density: 0.006%