INDEX
Explanations
mentions of people's names, specifically variations of the name "Ben"
repeated mentions of the name "Ben."
Negative Logits
inarily         -0.75
REDACTED        -0.74
æĸ¹             -0.74
DragonMagazine  -0.72
mson            -0.69
perse           -0.68
åŃ              -0.67
ngth            -0.67
hower           -0.65
âĶģ             -0.65
Positive Logits
jamin           1.50
oit             1.02
nington         1.02
chers           0.99
cher            0.96
ches            0.86
utz             0.84
imaru           0.83
ghazi           0.83
isher           0.82
Activations Density 0.015%