INDEX
Negative Logits
Token        Logit
myſelf       -1.27
pleaſure     -1.21
itſelf       -1.15
<bos>        -1.14
'\\;'        -1.11
Efq          -1.10
themſelves   -1.09
raiſ         -1.09
Monfieur     -1.09
Paglinawan   -1.09
Positive Logits
Token        Logit
(blank)      0.70
:            0.62
.            0.60
of           0.57
,            0.56
(            0.52
T            0.50
A            0.50
ch           0.49
O            0.48
Activations Density: 0.207%