Explanations
occurrences of the letter "b" in various contexts
Negative Logits
Token     Logit
ooth      -0.20
ern       -0.19
oo        -0.17
ounder    -0.17
yre       -0.16
ara       -0.16
usch      -0.16
igg       -0.16
ezier     -0.15
ived      -0.15
Positive Logits
Token     Logit
em        0.23
im        0.21
t         0.20
hatt      0.19
oc        0.19
ok        0.19
bery      0.19
ex        0.18
ility     0.18
acteria   0.18
Activations Density 0.164%