INDEX
NEGATIVE LOGITS
良               -0.07
düğü            -0.07
ileceği         -0.06
argins          -0.06
uname           -0.06
polate          -0.06
าษฎร            -0.06
unsustainable   -0.06
("/")↵          -0.06
Trong           -0.06

POSITIVE LOGITS
job             0.07
decreases       0.07
-pack           0.07
dues            0.07
vin             0.06
pore            0.06
rich            0.06
happily         0.06
hous            0.06
increases       0.06
Activations Density 0.017%