NEGATIVE LOGITS
Token      Logit
i          -0.56
for        -0.52
y          -0.51
(blank)    -0.50
sp         -0.48
the        -0.46
s          -0.46
an         -0.44
pade       -0.44
dır        -0.44
POSITIVE LOGITS
Token        Logit
myſelf       1.33
Efq          1.31
Houſe        1.28
itſelf       1.26
Majefty      1.24
pleaſure     1.24
themſelves   1.20
Jefus        1.20
faſt         1.20
houſe        1.20
Activations Density 0.060%