INDEX
Explanations
discussions surrounding expectations, opinions, and realizations in various contexts
Negative Logits
'},       -1.25
".        -1.23
"):       -1.23
")){      -1.21
"},       -1.20
"){       -1.17
'),       -1.15
)"),      -1.14
"),       -1.14
ſelves    -1.12
Positive Logits
.         1.30
,         1.22
!         1.15
;         1.14
?         0.99
:         0.78
!!        0.77
(         0.71
in        0.69
!!!       0.68
Activations Density 0.957%