Explanations
various forms of the period punctuation mark
HTML tags and special characters
Negative Logits
️    -0.54
er    -0.49
[blank]    -0.45
↵    -0.45
"    -0.43
*    -0.41
their    -0.40
s    -0.40
[blank]    -0.40
\    -0.39
Positive Logits
.<    1.22
:<    1.03
!<    0.90
)<    0.89
/<    0.89
。<    0.85
-<    0.84
;<    0.84
'<    0.83
+<    0.81
Activations Density 0.056%