Negative Logits
Efq        -1.45
pleaſure   -1.41
myſelf     -1.40
Theſe      -1.38
itſelf     -1.37
ſelf       -1.35
purpoſe    -1.33
ſche       -1.30
ſtate      -1.30
houſe      -1.30
Positive Logits
           0.70
(          0.67
'          0.66
in         0.64
W          0.63
,          0.63
S          0.60
[          0.60
O          0.58
’          0.58
Activations Density 0.114%