INDEX
Explanations
the name "beba" or similar variations
instances of the substring "ba" in various contexts
Negative Logits
Attention             -0.79
ICLE                  -0.70
worldly               -0.70
Uncommon              -0.69
ivities               -0.67
lessly                -0.66
externalActionCode    -0.65
andem                 -0.64
Dragonbound           -0.64
Principle             -0.64
Positive Logits
ques                  1.05
ba                    1.03
ñ                     0.97
uble                  0.89
aba                   0.88
ffe                   0.86
bsite                 0.82
olean                 0.81
velength              0.80
que                   0.78
Activations Density: 0.004%