INDEX
Negative Logits
Token              Logit
IsContent          -0.99
Мексичка           -0.99
pleaſure           -0.97
myſelf             -0.97
itſelf             -0.93
########.          -0.92
SequentialGroup    -0.90
+#+                -0.90
dafx               -0.88
Efq                -0.88
Positive Logits
Token              Logit
↵↵                 0.76
↵                  0.75
                   0.75
'                  0.74
’                  0.66
<eos>              0.64
(                  0.60
a                  0.59
-                  0.59
‘                  0.56
Activations Density 0.023%