INDEX
Explanations
maids followed by punctuation
Negative Logits
:       0.45
\|      0.44
$,      0.44
:       0.43
\"\     0.43
\"      0.42
\|      0.42
{}'.    0.42
\")     0.40
\"      0.40
Positive Logits
.’      0.65
.”      0.63
.“      0.58
.)      0.55
.”)     0.50
.</     0.50
."      0.48
.]      0.45
(“      0.43
,’      0.42
Activations Density 5.053%