INDEX
Explanations
strings with equal signs
punctuation and formatting symbols commonly used in structured data or programming
Negative Logits
!'      -1.18
.'      -1.17
?'      -1.16
,'      -1.08
)'      -0.92
'.      -0.89
.'"     -0.87
?'"     -0.82
!'"     -0.82
»       -0.81
Positive Logits
"       2.10
"'      1.87
"-      1.85
"...    1.83
"[      1.77
".      1.73
"(      1.73
"_      1.69
"#      1.65
"+      1.63
Activations Density: 0.183%