INDEX
Explanations
conjunctions and connective phrases
Negative Logits
æĻ        -0.16
égor      -0.15
DownList  -0.15
wnd       -0.14
adium     -0.14
gue       -0.14
pii       -0.14
chter     -0.14
ilos      -0.14
rava      -0.14
Positive Logits
ing     0.17
-       0.16
ingo    0.16
iesel   0.16
others  0.15
co      0.15
ita     0.15
dek     0.15
cent    0.15
deck    0.15
Activations Density 0.078%