INDEX
Explanations
mentions of the name "Todd" or similar words related to it
todd, toddlers, TOD
Negative Logits
Token      Logit
<eos>      -0.50
Hiller     -0.42
Schar      -0.41
↵↵         -0.40
ar         -0.39
Blume      -0.38
er         -0.38
Backman    -0.37
ilber      -0.37
Korn       -0.37
Positive Logits
Token      Logit
Todd       2.19
Todd       2.09
todd       1.66
todd       1.53
TOD        1.29
Toddler    1.02
Tod        1.02
myſelf     0.99
Tod        0.98
toddlers   0.97
Activations Density: 0.004%