INDEX
Explanations
repeated instances of the letter "b."
Negative Logits
Token    Logit
gil      -0.16
uxt      -0.15
Name     -0.14
yo       -0.14
kari     -0.14
deniz    -0.14
zew      -0.14
åĮ       -0.14
wre      -0.14
egra     -0.14
Positive Logits
Token    Logit
b        0.45
itters   0.21
б        0.20
oval     0.18
(b       0.18
*b       0.17
.b       0.17
ison     0.17
=b       0.16
ellig    0.15
Activations Density 0.034%
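
The density figure is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of token positions where the feature's activation is above zero. The function name, the threshold parameter, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.034% would correspond to the feature firing on roughly 34 of every 100,000 tokens.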