INDEX
Negative Logits
Autoritní         -0.87
مرئيه             -0.67
GEBURTS           -0.62
kefir             -0.62
metros            -0.61
Personensuche     -0.61
IsContent         -0.59
रीदारी            -0.59
Picchu            -0.59
whiteness         -0.59
Positive Logits
LikeLike          0.41
urit              0.41
anzi              0.41
extAlignment      0.40
ibration          0.39
bland             0.39
\]                0.36
-                 0.36
dubbo             0.36
ضان               0.35
Activations Density 0.003%