Explanations
variations of the word "le" and its forms
Negative Logits
Token      Logit
duit       -0.16
osi        -0.15
Demp       -0.15
acket      -0.15
pery       -0.15
anke       -0.14
Diss       -0.14
emale      -0.14
allback    -0.14
apesh      -0.14
Positive Logits
Token      Logit
istung     0.22
hr         0.22
ist        0.21
bens       0.20
icht       0.20
hra        0.19
hn         0.19
erra       0.19
ute        0.18
hrs        0.18
Activations Density 0.007%