INDEX
NEGATIVE LOGITS
iveness      -0.78
ional        -0.75
encers       -0.64
xual         -0.63
ivity        -0.63
iment        -0.62
smanship     -0.62
uve          -0.61
itational    -0.61
HER          -0.61
POSITIVE LOGITS
rolet        0.66
ãĥĦ          0.65
ĵĺ           0.58
Bulg         0.56
ãĥĥãĥĪ       0.56
Bye          0.54
Dempsey      0.53
jection      0.52
ota          0.52
lah          0.51
Activations Density 5.437%