INDEX
Negative Logits
hygiene         -0.08
intensive       -0.08
detox           -0.07
bear            -0.07
hack            -0.07
unrealistic     -0.07
competitions    -0.07
hed             -0.07
slick           -0.07
subsidies       -0.07
Positive Logits
movable         0.09
fuss            0.08
-,              0.08
courrier        0.08
manfaat         0.08
mandib          0.08
nearest         0.08
ьми             0.08
今回            0.08
neighbor        0.08
Activations Density 0.019%