INDEX
Explanations
phrases related to political figures, news broadcasting, and commentary
Negative Logits
inates      -0.91
relative    -0.76
RED         -0.71
urally      -0.67
metic       -0.66
cles        -0.66
ifice       -0.65
mine        -0.65
holes       -0.64
ifiable     -0.64
Positive Logits
star        0.95
Morning     0.94
Consult     0.85
Madness     0.82
night       0.81
Glory       0.77
Dew         0.77
aturdays    0.77
Caller      0.76
Evening     0.74
Activations Density: 0.023%