Explanations
phrases related to actions or events occurring in the past
references to previous mentions or context in a discussion
Negative Logits
`.                -1.23
..                -1.23
`,                -1.12
..."              -1.01
...               -1.00
`                 -1.00
"]=>              -0.99
//                -0.97
..............    -0.95
......            -0.92
Positive Logits
—                 2.54
)—                2.35
—"                2.34
"—                2.26
—-                1.83
.—                1.78
,—                1.64
——                1.44
Advertisement     1.38
--                1.30
Activations Density: 0.667%