Negative Logits
Token     Value
Hepinize  1.11
(!)       1.10
(!        1.09
!:        1.06
¡         1.03
(!)       1.00
!,        0.93
!..       0.93
¡         0.89
!,        0.89
Positive Logits
Token     Value
though    0.96
Though    0.92
though    0.88
which     0.88
which     0.83
there     0.81
yg        0.78
they      0.77
说是      0.76
purview   0.75
Activation Density: 0.010%