INDEX
Explanations
mentions of the name "Sanders."
mentions of political figures and events
Negative Logits
gling         -0.56
(#            -0.55
itar          -0.53
leigh         -0.53
Mechanical    -0.52
LR            -0.52
usercontent   -0.51
Mons          -0.51
chers         -0.50
sth           -0.50
Positive Logits
explodes      0.53
accompan      0.52
=]            0.50
blister       0.50
accuser       0.50
ccording      0.50
Charges       0.49
rejects       0.49
]);           0.49
slams         0.49
Activations Density: 0.030%