Negative Logits
luas        -0.08
.Fr         -0.08
.Shapes     -0.08
diabetic    -0.08
tires       -0.08
높은        -0.07
espe        -0.07
bea         -0.07
Luc         -0.07
далеко      -0.07
Positive Logits
означ        0.09
denying      0.09
reject       0.08
criticizing  0.08
competente   0.08
penal        0.08
Kent         0.08
Penal        0.08
equivalente  0.08
existential  0.07
Activations Density 0.012%