Explanations
phrases related to political news and events
topics related to political events and figures
Negative Logits
ODUCT         -0.80
Newsletter    -0.79
BILITIES      -0.74
ILCS          -0.74
76561         -0.72
NetMessage    -0.71
roleum        -0.69
iably         -0.69
oaded         -0.67
cffffcc       -0.67
Positive Logits
Replay        1.15
?'            0.96
]'            0.95
!'            0.85
'             0.83
?]            0.82
',            0.82
']            0.82
'.            0.78
Dems          0.78
Activations Density 0.676%