INDEX
Explanations
the name "Joe" with considerable strength
mentions of the name "Joe."
Negative Logits
NESS        -0.92
hips        -0.87
glim        -0.79
igated      -0.73
igators     -0.72
mble        -0.70
igator      -0.69
narrator    -0.69
ample       -0.69
raints      -0.67
Positive Logits
Biden        0.99
pport        0.90
ppo          0.89
Arpaio       0.89
zzi          0.86
Rog          0.81
Scarborough  0.80
Camel        0.79
xtap         0.79
Russo        0.79
Activations Density: 0.029%