Explanations
negative qualifiers and words indicating limitation or restriction
Negative Logits
Token     Logit
anzi      -0.16
amina     -0.14
annies    -0.14
lian      -0.13
ãĤĦãģĻ    -0.13
glo       -0.13
AMI       -0.13
teri      -0.13
å¼¥       -0.13
/fs       -0.13
Positive Logits
Token     Logit
enville   0.17
rych      0.14
ethyst    0.14
dük       0.14
emean     0.14
umba      0.14
inki      0.13
PLE       0.13
åĽº       0.13
oggler    0.13
Activations Density: 0.092%