Explanations
the presence of document structure or formatting indicators
Negative Logits
Token   Logit
,       -0.79
_       -0.78
-       -0.77
/       -0.63
(       -0.59
ma      -0.58
us      -0.58
.       -0.58
...     -0.56
;       -0.56
Positive Logits
Token   Logit
"):     1.29
)";     1.19
')      1.18
)");    1.18
'))     1.17
"])     1.17
"]);    1.15
")));   1.15
']))    1.11
The     1.09
Activations Density: 0.317%