INDEX
NEGATIVE LOGITS
救  -0.08
rik  -0.08
/em  -0.08
fic  -0.07
(loc  -0.07
dept  -0.07
relocation  -0.07
وين  -0.07
emigr  -0.07
(for  -0.07
POSITIVE LOGITS
biography  0.09
Biography  0.09
時計  0.08
Curious  0.08
primjer  0.08
maschine  0.08
tease  0.08
uchen  0.08
reciproc  0.08
fony  0.08
Activations Density 0.001%