NEGATIVE LOGITS
-taking        -0.07
лина           -0.06
flower         -0.06
pulver         -0.06
mutually       -0.06
escape         -0.06
fracture       -0.06
parallel       -0.06
participate    -0.06
(boost         -0.06
POSITIVE LOGITS
branding       0.12
branded        0.10
Rosie          0.07
Prague         0.07
_FLAGS         0.07
бин            0.07
transgender    0.06
_sc            0.06
_RCC           0.06
BAR            0.06
Activations Density 0.003%