INDEX
Explanations
adjectives and adverbs that imply a critical or evaluative tone
phrases that convey criticism or sarcasm
NEGATIVE LOGITS
Token   Logit
".      -0.70
'."     -0.68
!".     -0.63
".[     -0.62
.''     -0.59
?".     -0.58
morrow  -0.58
$.      -0.58
.""     -0.57
."[     -0.57
POSITIVE LOGITS
Token           Logit
quotes          0.80
irony           0.71
understatement  0.71
bole            0.71
cerpt           0.70
acron           0.67
disclaimer      0.67
oqu             0.66
analogy         0.66
excerpt         0.65
Activations Density: 0.870%