INDEX
NEGATIVE LOGITS
glories      1.31
usefulness   1.27
pesky        1.24
i            1.19
oferty       1.16
vandalism    1.15
finitely     1.15
unpredict    1.15
harassing    1.10
hesitation   1.09
POSITIVE LOGITS
้น           0.95
punte        0.92
ة            0.92
cáps         0.92
ция          0.88
্লোক         0.87
య            0.86
perquè       0.86
හ            0.85
Ÿ            0.84
Activations Density 0.000%