INDEX
Explanations
specific references to locations or geographical significance
Negative Logits
dziew        -0.17
yyn          -0.16
zbyt         -0.16
imli         -0.15
assignable   -0.15
erie         -0.15
Paladin      -0.14
zdrav        -0.14
باش          -0.14
zers         -0.14
Positive Logits
pend    0.20
peril   0.19
unsur   0.19
perl    0.18
peng    0.18
gay     0.18
pel     0.18
kon     0.17
per     0.17
sos     0.17
Activations Density: 0.003%