INDEX
Explanations
instances of the letter "b" and related variations in different contexts
Negative Logits
l       -0.27
oles    -0.26
ìĿ´     -0.24
ar      -0.21
lad     -0.21
oi      -0.21
oj      -0.20
rp      -0.20
oor     -0.20
oft     -0.20
Positive Logits
eyond   0.21
ey      0.19
bery    0.19
ek      0.18
eye     0.18
ale     0.18
egal    0.18
egg     0.18
ellow   0.17
ocop    0.17
Activations Density 0.077%