INDEX
Negative Logits
object       0.50
—            0.50
if           0.48
the          0.46
something    0.45
bottom       0.45
apparent     0.44
beginnings   0.44
more         0.43
unapolog     0.42
Positive Logits
等人         0.47
heng         0.45
ȩ            0.44
ǧ            0.42
oitte        0.42
ਦਰ           0.41
arrerol      0.41
ouard        0.41
cheng        0.40
arxiv        0.39
Activations Density 0.011%