INDEX
Negative Logits
ſelf        -0.99
Majefty     -0.94
Efq         -0.89
]--;        -0.88
Jefus       -0.84
ſelves      -0.82
Means       -0.82
houſe       -0.82
greateſt    -0.82
Houſe       -0.81
Positive Logits
e           0.86
o           0.72
a           0.72
y           0.62
of          0.60
ever        0.60
el          0.51
how         0.50
it          0.49
top         0.49
Activations Density: 0.673%