INDEX
Explanations
LaTeX math mode symbols
Math formulas
Negative Logits
Token      Logit
-          -0.93
,          -0.90
.          -0.80
(blank)    -0.72
'          -0.71
!          -0.70
:          -0.68
;          -0.68
(          -0.67
...        -0.67
Positive Logits
Token      Logit
itſelf     1.45
Monfieur   1.40
myſelf     1.40
Theſe      1.38
Efq        1.32
pleaſure   1.28
ſtate      1.24
auffi      1.23
Jefus      1.23
raiſ       1.22
Activations Density 1.220%