INDEX
Explanations
the name "Mary" and its variations in different contexts
Negative Logits
mach     -0.15
irl      -0.15
ayet     -0.15
kova     -0.15
Fur      -0.15
staw     -0.15
tega     -0.15
leys     -0.15
دار      -0.15
IDER     -0.14
Positive Logits
Ann      0.21
ann      0.20
beth     0.19
Ann      0.19
ellen    0.19
ke       0.18
trs      0.18
sville   0.18
weather  0.18
mount    0.18
Activations Density: 0.009%