Explanations
computer code and errors
Negative Logits
Token       Logit
myſelf      -1.46
itſelf      -1.41
purpoſe     -1.41
ſtate       -1.34
ſtand       -1.34
Theſe       -1.33
ſche        -1.32
Monfieur    -1.32
pleaſure    -1.30
raiſ        -1.29
Positive Logits
Token       Logit
,           0.88
.           0.81
)           0.77
;           0.77
(           0.76
:           0.74
(           0.71
*           0.69
was         0.68
<eos>       0.67
Activations Density: 1.562%