INDEX
Explanations
future intentions or plans mentioned in a negative light
phrases indicating improbability or denial of occurrence
Negative Logits
guessing      -0.69
OTO           -0.68
refusing      -0.67
cule          -0.67
pledging      -0.66
likened       -0.65
withholding   -0.65
vowed         -0.65
gazing        -0.64
Reviewer      -0.64
Positive Logits
exist         1.37
suffice       1.34
occur         1.34
affect        1.31
resonate      1.25
happen        1.24
succeed       1.24
survive       1.14
exist         1.12
translate     1.10
Activations Density: 0.265%