INDEX
Explanations
mentions of political figures, particularly senators
instances of the word "Senator."
Negative Logits
tera        -0.67
ãĥ¼ãĥ³      -0.67
aster       -0.66
lot         -0.66
brim        -0.66
heading     -0.64
arching     -0.64
leash       -0.63
anked       -0.61
oper        -0.61
Positive Logits
Senator     0.98
ileaks      0.91
Senator     0.90
ial         0.85
Barack      0.84
Bernie      0.80
iors        0.79
eca         0.77
senator     0.76
clinton     0.75
Activations Density 0.013%