Explanations
modal verbs indicating permission or prohibition
Negative Logits
exion          -0.17
enson          -0.16
ursive         -0.15
stantiate      -0.14
welcome        -0.14
elly           -0.14
/*@            -0.14
ago            -0.14
componentDid   -0.14
bst            -0.14
Positive Logits
ستانی          0.16
ewe            0.15
Household      0.14
dar            0.14
alion          0.14
nao            0.13
침             0.13
zdy            0.13
uale           0.13
度             0.13
Activations Density 0.013%