INDEX
Negative Logits
‐'          -0.08
MID         -0.07
Oriental    -0.07
알          -0.07
@{          -0.06
irres       -0.06
arrogant    -0.06
Alec        -0.06
поход       -0.06
pozdě       -0.06
Positive Logits
)*/↵        0.07
certainly   0.06
created     0.06
stanov      0.06
ErrorCode   0.06
ERP         0.06
") ↵        0.06
changes     0.06
officially  0.06
ção         0.06
Activations Density 0.000%