Explanations
mentions of the political figure "Cruz"
Negative Logits
Token     Logit
à¨        -0.70
OX        -0.67
HMS       -0.66
neur      -0.66
ebin      -0.65
unseen    -0.64
Seym      -0.64
Pebble    -0.63
OA        -0.61
bye       -0.61
Positive Logits
Token         Logit
Cruz          1.08
Cruz          1.03
Rubio         0.84
omics         0.84
Caucus        0.83
supporters    0.81
ettes         0.78
supporter     0.76
Rafael        0.75
itus          0.75
Activations Density: 0.060%