Explanations
mentions of the name "Todd" at various activations
mentions of the name "Todd."
Negative Logits
andum      -0.75
ugal       -0.69
naire      -0.69
mileage    -0.68
ĺħ         -0.66
prus       -0.64
incent     -0.62
aries      -0.62
enary      -0.61
bnb        -0.60
Positive Logits
lers       1.30
ler        1.17
Akin       0.94
McF        0.92
Whitman    0.90
leness     0.88
Frazier    0.83
Haley      0.83
LER        0.82
Gur        0.81
Activations Density: 0.018%