INDEX
Explanations
significant pronouns and verbs expressing agency or action
Negative Logits
Token        Logit
.writeln     -0.16
adu          -0.15
cio          -0.15
ÛĮرÛĮ       -0.14
ilm          -0.14
ibus         -0.14
ItemImage    -0.14
mium         -0.14
ximity       -0.14
none         -0.14
Positive Logits
Token        Logit
odst         0.15
xD           0.15
Beaut        0.14
Tod          0.14
immediately  0.14
-solid       0.14
immediate    0.13
éĩı          0.13
itis         0.13
далÑĮ        0.13
Activations Density: 0.004%