NEGATIVE LOGITS
myſelf      -1.01
leſs        -1.00
leaſt       -0.99
Efq         -0.99
ſta         -0.97
pleaſure    -0.97
greateſt    -0.96
ſmall       -0.93
houſe       -0.91
preſent     -0.91
POSITIVE LOGITS
For         1.03
Of          0.81
To          0.80
In          0.79
of          0.75
↵↵          0.75
And         0.72
↵           0.71
            0.70
<eos>       0.69
Activations Density 0.141%