INDEX
Negative Logits
ⅽ  -0.74
compromised  -0.73
dumped  -0.72
notable  -0.71
dotte  -0.71
włas  -0.70
noteworthy  -0.69
サロン  -0.69
retweeted  -0.69
詹  -0.68
Positive Logits
изменения  0.69
CHF  0.69
۶  0.68
煌  0.67
Rules  0.67
PILOT  0.66
きら  0.66
OST  0.66
AKP  0.66
ണ്  0.65
Activations Density: 0.046%