INDEX
NEGATIVE LOGITS
Token         Logit
itſelf        -1.12
themselves    -1.10
itself        -1.07
himself       -1.07
himself       -1.03
Himself       -0.98
themselves    -0.98
themſelves    -0.97
itself        -0.97
himſelf       -0.97
POSITIVE LOGITS
Token         Logit
can           0.52
understand    0.46
but           0.44
finally       0.43
want          0.43
have          0.42
get           0.42
if            0.42
we            0.41
know          0.41
Activations Density: 0.023%