INDEX
Explanations
variations of the word "ein" (meaning "a" or "one" in German)
Negative Logits
dess    -0.17
ille    -0.17
ks      -0.16
a       -0.15
jev     -0.15
kov     -0.15
uby     -0.14
gal     -0.14
sy      -0.14
sq      -0.14
Positive Logits
zel     0.27
wo      0.23
igen    0.23
heit    0.23
zelf    0.22
heiten  0.22
iges    0.21
heits   0.20
ige     0.20
zig     0.19
Activations Density 0.007%