Explanations
the word "it" and variations of its usage throughout the text
Negative Logits
anova       -0.16
Streams     -0.15
ULD         -0.15
ahoma       -0.15
ляд         -0.14
ogne        -0.14
exe         -0.14
orks        -0.14
بن          -0.14
.Unicode    -0.13
Positive Logits
own         0.28
self        0.21
SELF        0.21
自己        0.21
own         0.20
自己的      0.19
-self       0.19
iner        0.19
selves      0.18
OWN         0.18
Activations Density 0.222%