INDEX
Explanations
proper nouns related to politics and media
references to specific organizations or people associated with political commentary
Negative Logits
aton        -0.84
fil         -0.82
hog         -0.77
film        -0.73
ands        -0.72
byter       -0.70
elled       -0.68
abilities   -0.66
ouri        -0.66
atted       -0.66
Positive Logits
Leilan      0.81
ante        0.76
sburgh      0.73
Roz         0.71
Hots        0.70
apy         0.70
Thatcher    0.63
azines      0.63
Phoenix     0.61
mopolitan   0.60
Activations Density: 0.059%