NEGATIVE LOGITS
itſelf      -1.31
raiſ        -1.28
Reſ         -1.27
myſelf      -1.25
Monfieur    -1.24
Efq         -1.21
pleaſure    -1.18
tranſ       -1.15
Diſ         -1.15
ſche        -1.14

POSITIVE LOGITS
.            0.60
</em>        0.56
?            0.52
,            0.50
</i>         0.50
!            0.49
:            0.49
(            0.48
G            0.47
yo           0.46

Activations Density: 0.021%
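
The density figure above is not defined in this dump. A minimal sketch of one common reading, assuming it means the percentage of token positions at which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of active positions; the source
    does not state the definition it uses.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 2 active positions out of 10,000.
sample = [0.0] * 9998 + [0.7, 1.3]
print(f"{activation_density(sample):.3f}%")  # prints 0.020%
```

Under this reading, a density of 0.021% would correspond to roughly 2 active positions in every 10,000 tokens.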