INDEX
Explanations
occurrences of the word "de" and its variations in different contexts
Negative Logits
人               -0.16
ildren          -0.15
ieber           -0.14
kaar            -0.14
igers           -0.14
rrha            -0.14
iaux            -0.14
nyder           -0.14
elm             -0.14
plementation    -0.14
Positive Logits
way             0.20
manera          0.17
way             0.17
acuerdo         0.17
forma           0.17
str             0.16
WAY             0.16
note            0.16
961             0.16
manière         0.15
Activations Density 0.024%