Explanations
references to the name "Ted" and its variations
Negative Logits
lish     -0.17
ohl      -0.15
inding   -0.15
_HAL     -0.15
olik     -0.15
ivor     -0.15
heiro    -0.14
ën       -0.14
etler    -0.14
eldom    -0.14
Positive Logits
dy       0.27
esco     0.18
ious     0.18
Ted      0.17
IOUS     0.17
dy       0.17
di       0.17
TED      0.17
Bundy    0.16
ros      0.16
Activations Density: 0.009%