INDEX
Explanations
quotation marks or parentheses
Negative Logits
Token   Logit
'       1.95
"       1.69
"'      1.52
'"      1.34
'[      1.19
,"      1.17
"\      1.16
'.      1.16
',      1.15
''      1.15
Positive Logits
Token   Logit
(“      1.91
“       1.90
(“      1.88
“‘      1.84
’”      1.83
“(      1.79
“       1.77
.’”     1.70
’’      1.69
”)      1.68
Activations Density 0.072%
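
A density figure like this is commonly read as the fraction of token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only. It is an assumption, not a confirmed description of how this dashboard computes the value: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all; the source does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 72 of which activate.
acts = [0.0] * 100_000
for i in range(0, 72_000, 1_000):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.072%
```

Under this reading, a density of 0.072% means the feature fires on roughly 7 out of every 10,000 tokens, which would make it a sparse feature.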