INDEX
Explanations
questions starting with "why" and "why isn't"
negative contractions used in rhetorical questions
Negative Logits
urst            -0.84
isers           -0.70
fter            -0.65
iser            -0.65
zer             -0.64
Intent          -0.64
understatement  -0.64
Invalid         -0.62
Operator        -0.61
oric            -0.61
Positive Logits
itia            0.75
hin             0.71
prosecute       0.69
assimil         0.69
ippi            0.67
amus            0.66
mention         0.66
vacc            0.63
nailed          0.63
promptly        0.63
Activations Density 0.197%