Explanations
punctuation marks, particularly quotation marks and apostrophes
Negative Logits
'));
-1.08
}');
-0.97
]');
-0.96
)');
-0.93
'){
-0.91
%");
-0.89
_
-0.86
...');
-0.85
'):
-0.85
/');
-0.82
Positive Logits
“
1.60
“
1.27
("
1.22
”
1.16
,“
1.15
(“
1.12
.“
1.10
"
1.10
"
1.09
="
1.09
Activations Density 0.545%