INDEX
Negative Logits
Token      Logit
Hello      -0.60
in         -0.59
on         -0.59
Hey        -0.55
of         -0.53
Dear       -0.51
at         -0.50
or         -0.48
del        -0.48
Alright    -0.47
Positive Logits
Token        Logit
Efq          0.89
houſe        0.88
myſelf       0.88
Majefty      0.88
Houſe        0.87
Monfieur     0.86
Mahomet      0.85
Shakspeare   0.84
itſelf       0.84
pleaſure     0.83
Activations Density 0.123%