NEGATIVE LOGITS
eğer      0.54
anty      0.49
parking   0.47
artery    0.45
단        0.45
jeśli     0.45
丽        0.45
estén     0.44
yıld      0.43
אם        0.43
POSITIVE LOGITS
शीलता     0.50
Natives   0.46
ιο        0.45
рів       0.45
Rim       0.44
Mainland  0.44
ството    0.44
த்திற்க   0.43
emans     0.43
žal       0.42
Activations Density 0.000%