INDEX
NEGATIVE LOGITS
h	0.42
lingu	0.41
ள்ளார்	0.40
designer	0.40
A	0.40
oulis	0.39
ѕ	0.39
asta	0.37
s	0.37
mobil	0.37
POSITIVE LOGITS
ン	0.61
treaties	0.52
öğrend	0.48
ურთიერთ	0.47
ły	0.47
教程	0.46
compartilh	0.46
Heter	0.44
Nội	0.44
	0.43
Activations Density 0.000%