Explanations
The presence of the letter 'D' in various contexts throughout the text.
Negative Logits
Token        Logit
ÑģÑĤÑĢÑĥ     -0.18
ies          -0.17
UDO          -0.15
št           -0.15
irma         -0.14
io           -0.14
umbles       -0.14
bia          -0.14
byter        -0.14
utherford    -0.14
Positive Logits
Token        Logit
dual         0.15
antan        0.15
el           0.15
arr          0.14
osta         0.14
ablo         0.14
iky          0.14
Exec         0.14
dep          0.14
ar           0.13
Activations Density 0.086%