Explanations
mentions of individuals named "Joe" with varying degrees of significance
Negative Logits
NESS        -0.89
hips        -0.87
glim        -0.78
igators     -0.75
igated      -0.73
mble        -0.72
ancies      -0.70
iments      -0.68
chwitz      -0.68
narrator    -0.67
Positive Logits
Biden        0.96
ppo          0.92
pport        0.89
Arpaio       0.87
zzi          0.83
Camel        0.82
xtap         0.81
Rog          0.80
Lieberman    0.79
xon          0.79
Activations Density: 0.016%