INDEX
Negative Logits
Token       Logit
Efq         -1.86
Theſe       -1.84
myſelf      -1.80
itſelf      -1.73
Monfieur    -1.66
raiſ        -1.62
Jefus       -1.57
pleaſure    -1.49
whoſe       -1.47
ſeveral     -1.47
Positive Logits
Token       Logit
(blank)     1.21
2           1.04
↵           0.88
1           0.85
<eos>       0.83
I           0.82
↵↵          0.81
...         0.80
…           0.79
P           0.77
Activations Density 0.006%