NEGATIVE LOGITS
코             -0.07
WD            -0.07
favourable    -0.07
kd            -0.06
posting       -0.06
uç            -0.06
zona          -0.06
arias         -0.06
کور           -0.06
King          -0.06
POSITIVE LOGITS
immature            0.09
irresponsible       0.07
childish            0.07
↵                   0.06
↵                   0.06
.SceneManagement    0.06
Status              0.06
camar               0.06
childcare           0.06
nevě                0.06
Activations Density: 0.003%