INDEX
Explanations
proper names
mentions of the name "Jay."
Negative Logits
Token      Logit
iture      -0.83
ITIES      -0.80
milo       -0.79
ENTION     -0.75
htaking    -0.72
ACTED      -0.71
idad       -0.69
ancial     -0.67
IRE        -0.65
cffff      -0.65
Positive Logits
Token      Logit
hawks      1.25
hawk       1.13
walking    1.02
haw        0.95
den        0.94
lon        0.92
jay        0.89
Jay        0.88
bird       0.88
pee        0.86
Activations Density: 0.019%