INDEX
Explanations
references to Poland or Polish identity
Negative Logits
lay          -0.16
oras         -0.15
lä           -0.15
el           -0.15
elle         -0.15
Middleton    -0.15
amura        -0.15
elson        -0.14
elo          -0.14
disposing    -0.14
Positive Logits
onia         0.24
onus         0.20
onica        0.20
aris         0.20
noc          0.20
anco         0.19
onium        0.19
ity          0.19
noon         0.18
vere         0.18
Activations Density: 0.012%