INDEX
NEGATIVE LOGITS
DP               -0.75
sumpay           -0.67
chtete           -0.66
Cornish          -0.64
TPR              -0.63
ussis            -0.63
DP               -0.61
femininas        -0.61
pouvoit          -0.60
peuples          -0.60
POSITIVE LOGITS
veras             0.52
sin               0.50
formik            0.49
sea               0.49
awtextra          0.48
fum               0.47
WriteTagHelper    0.47
din               0.45
Autoritní         0.45
barat             0.45
Activations Density 0.095%