Explanations
mentions of the name "Bernard" and variations of "Bernie Sanders."
Negative Logits
rawer      -0.17
petition   -0.16
_CSR       -0.15
sgi        -0.15
aşı        -0.15
олос       -0.15
Filled     -0.14
itech      -0.14
riority    -0.14
det        -0.14
Positive Logits
ini        0.17
Sanders    0.17
yz         0.17
once       0.17
Bernie     0.16
eger       0.16
elda       0.15
ine        0.15
Gall       0.15
otas       0.15
Activations Density 0.010%