Explanations
statements of certainty or assurance
the phrase "there's no" and its variations, indicating a negation or absence of something
Negative Logits
ean       -0.76
late      -0.74
aug       -0.72
romeda    -0.71
onduct    -0.70
NF        -0.69
atl       -0.69
lish      -0.68
cheon     -0.68
eds       -0.67
Positive Logits
doubt            1.32
denying          1.27
shortage         1.27
excuse           1.23
reason           1.16
contradiction    1.09
shame            1.09
downside         1.07
question         1.06
justification    1.02
Activations Density 0.066%