Explanations
mathematical expressions and symbols in the text
Negative Logits
Token    Logit
"        -0.58
-        -0.55
.        -0.52
...      -0.52
↵        -0.51
+        -0.49
P        -0.47
,        -0.45
2        -0.45
te       -0.45
Positive Logits
Token             Logit
ſelf              1.02
nahilalakip       1.02
myſelf            0.93
Reſ               0.90
itſelf            0.90
ſelves            0.86
defaultstate      0.86
GenerationType    0.85
neceff            0.81
raiſ              0.80
Activations Density: 0.407%