Negative Logits
Token       Logit
stuff       -0.10
Hos         -0.08
Stuff       -0.08
Germ        -0.08
Uml         -0.08
yom         -0.08
Subway      -0.07
dito        -0.07
Amor        -0.07
spr         -0.07
Positive Logits
Token       Logit
eisen       0.08
mot         0.08
aminen      0.07
obr         0.07
দের         0.07
ymmetric    0.07
tahan       0.07
(es         0.07
icuous      0.07
morally     0.07
Activations Density: 0.017%