INDEX
NEGATIVE LOGITS
chter           -0.17
erk             -0.15
olie            -0.15
ôle             -0.14
erna            -0.14
tera            -0.14
etty            -0.14
Redistribution  -0.14
olib            -0.14
Blasio          -0.14
POSITIVE LOGITS
buz             0.15
iggins          0.14
AQ              0.14
karş            0.14
сит             0.14
getc            0.13
disciplinary    0.13
ména            0.13
"She            0.13
shine           0.13
Activations Density: 0.015%