Negative Logits
unary            0.41
untenable        0.39
unwillingness    0.38
perturbation     0.38
fragmentary      0.37
repug            0.37
culp             0.37
assail           0.37
irradiated       0.36
grammatical      0.36
Positive Logits
Terbaik          0.61
terbaik          0.51
kaufen           0.51
baratos          0.49
Terbaru          0.48
Philippines      0.48
лучшие           0.47
kopen            0.46
мыкты            0.46
terbaru          0.46
Activations Density: 0.188%