Negative Logits
Token    Logit
Ro       -0.83
Sa       -0.73
Or       -0.67
Tr       -0.65
H        -0.63
I        -0.61
or       -0.60
Ho       -0.60
Bar      -0.58
P        -0.57
Positive Logits
Token       Logit
ſelves      1.10
pleaſure    1.10
ſelf        1.07
Monfieur    1.07
myſelf      1.05
Majefty     1.01
purpoſe     1.00
ſmall       1.00
itſelf      0.99
cauſe       0.97
Activations Density: 0.156%