NEGATIVE LOGITS
Token     Logit
i         -1.33
u         -0.80
viz       -0.75
ie        -0.65
I         -0.63
t         -0.60
August    -0.54
um        -0.52
y         -0.49
k         -0.49
POSITIVE LOGITS
Token     Logit
myſelf    1.24
purpoſe   1.22
ſche      1.16
pleaſure  1.15
houſe     1.13
ſever     1.09
itſelf    1.09
juſ       1.07
uſe       1.04
Majefty   1.01
Activations Density 0.075%