Negative Logits
妇女  0.82
chargers  0.79
尙  0.79
깆  0.79
does  0.76
doesnt  0.76
डाइ  0.76
Exclude  0.72
उपसर्ग  0.70
讹  0.70
Positive Logits
amicable  0.81
innerhalb  0.76
بینی  0.74
increased  0.74
எழுது  0.72
SMI  0.70
々は  0.69
群体  0.69
々の  0.67
kube  0.67
Activations Density 0.045%