INDEX
Explanations
names of individuals
the name "Jake" in various contexts
NEGATIVE LOGITS
Token      Logit
conduc     -0.78
amera      -0.70
iary       -0.69
iated      -0.68
subst      -0.66
inals      -0.65
arily      -0.65
ributed    -0.64
inances    -0.64
glim       -0.63
POSITIVE LOGITS
Token      Logit
glers      0.94
Jake       0.89
unin       0.83
Gy         0.83
imaru      0.82
Skywalker  0.81
EStream    0.79
ango       0.77
warm       0.77
ansas      0.76
Activations Density: 0.012%
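
As a rough illustration of how a density figure like this could be read, the sketch below assumes that "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation. That definition, the function name `activation_density`, and the synthetic activation list are all assumptions made for illustration; none of them come from the page itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a token counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not from the source): 12 active tokens out of 100,000.
sample = [0.0] * 99_988 + [1.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.012% would correspond to roughly 12 active tokens per 100,000 sampled, which is consistent with a sparse feature tied to a specific name.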