Explanations
quotes or dialogue within the text
Negative Logits
"-"     -0.22
"%s"    -0.22
"/"     -0.20
"<br"   -0.19
" "     -0.18
"\n"    -0.18
"*"     -0.17
"%d"    -0.17
"_C"    -0.16
"_F"    -0.16
Positive Logits
"the"   0.33
"a"     0.31
"â̦"    0.29
"â̦"    0.29
"it"    0.28
"â̦."   0.27
"it"    0.27
"Âħ"    0.27
"we"    0.27
"if"    0.27
Activations Density 0.265%