INDEX
Explanations
the division symbol '/'
Negative Logits
Token    Logit
/        -2.06
,        -1.31
.        -1.16
(        -1.06
or       -0.99
-        -0.94
“        -0.91
(        -0.85
a        -0.82
and      -0.82
Positive Logits
Token         Logit
<bos>         2.14
myſelf        1.80
Theſe         1.79
itſelf        1.69
Monfieur      1.65
auffi         1.55
ainfi         1.53
ſelf          1.51
Reſ           1.48
themſelves    1.47
Activations Density: 0.460%