INDEX
Explanations
mentions of the name "Ben" followed by a number (9 or 10)
mentions of the name "Ben"
Negative Logits
Token           Logit
REDACTED        -0.84
DragonMagazine  -0.82
æĸ¹             -0.79
inarily         -0.77
perse           -0.73
mson            -0.71
£ı              -0.71
åŃ              -0.70
ngth            -0.70
mble            -0.68
Positive Logits
Token           Logit
jamin           1.49
oit             1.00
nington         0.97
cher            0.93
chers           0.93
utz             0.86
arding          0.83
ito             0.82
Franklin        0.81
isher           0.81
Activations Density 0.009%
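
For a sense of scale, a density of 0.009% corresponds to roughly 9 activating tokens per 100,000. The sketch below assumes that density means the fraction of token positions where the feature's activation is above zero; the function name, the threshold parameter, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its
    activation is strictly greater than the threshold (default 0).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 token positions, 9 of which activate the feature.
example = [0.0] * 99_991 + [1.2] * 9
density = activation_density(example)
print(f"{density:.3%}")  # prints 0.009%
```

A feature this sparse fires on very few tokens, which fits an explanation as narrow as a single name.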