INDEX
Explanations
occurrences of the word "do" in various forms and contexts
Negative Logits
nya       -0.18
ka        -0.18
lify      -0.17
ervas     -0.16
noon      -0.16
uelle     -0.15
ulous     -0.15
wo        -0.15
elay      -0.15
apolis    -0.15
Positive Logits
ÅĤÄħ      0.20
zens      0.19
berman    0.17
ehler     0.17
ctype     0.17
osing     0.16
ob        0.16
,         0.15
ress      0.15
leta      0.15
Activations Density 0.065%