INDEX
Explanations
Instances of the word "Der" and its variations, which indicate references to titles or names in German contexts.
Negative Logits
igkeit      -0.16
taire       -0.15
ATUS        -0.15
alled       -0.15
è¾          -0.15
ãĥ£         -0.15
stown       -0.15
lover       -0.15
runner      -0.15
yr          -0.15
Positive Logits
anged       0.27
ivative     0.24
angement    0.24
ksen        0.21
went        0.20
neÄŁi       0.20
ivation     0.19
ivery       0.19
wish        0.19
udder       0.17
Activations Density 0.014%