INDEX
Negative Logits
代             -0.31
Hos           -0.28
EMENT         -0.28
ISTER         -0.28
Presbyterian  -0.28
UN            -0.27
IRO           -0.27
çͰ            -0.26
TOR           -0.26
Ms            -0.26
Positive Logits
nost          0.29
idal          0.29
tail          0.28
¥ŀ            0.27
unden         0.27
undermin      0.27
booze         0.27
manif         0.26
imperson      0.26
phr           0.26
Activations Density: 0.086%