INDEX
Explanations
numerical expressions, particularly those related to ranges or measurements
Negative Logits
(       -0.68
in      -0.63
B       -0.59
has     -0.59
,       -0.59
.       -0.57
C       -0.57
at      -0.57
O       -0.56
“       -0.56
Positive Logits
Monfieur    1.29
myſelf      1.23
Efq         1.20
Majefty     1.16
pleaſure    1.16
faſt        1.15
ſever       1.13
Theſe       1.13
fevere      1.10
ſeveral     1.09
Activations Density: 2.482%