Explanations
the word "ben" with different numeric activations
the presence of the term "ben" in various contexts
Negative Logits
Kinnikuman        -0.83
WD                -0.66
inarily           -0.65
RS                -0.64
mson              -0.64
earable           -0.62
displayText       -0.61
matic             -0.60
CV                -0.60
TPPStreamerBot    -0.58
Positive Logits
jamin             1.62
ghazi             0.97
furt              0.92
nington           0.89
chers             0.88
cher              0.85
heimer            0.84
acon              0.83
emies             0.82
ben               0.81
Activations Density 0.015%