INDEX
Explanations
occurrences of the substring "en" in various contexts
Negative Logits
z       -0.53
ch      -0.49
es      -0.45
th      -0.35
zá      -0.34
ne      -0.33
za      -0.31
o       -0.31
zelf    -0.30
zell    -0.30
Positive Logits
jandro    0.15
velope    0.14
ey        0.14
stitial   0.14
esis      0.13
viron     0.13
joy       0.13
warz      0.13
تاب       0.13
eni       0.12
Activations Density 0.051%