INDEX
Explanations
phrases or sentences where the word "no" is prominently featured
instances of negation or refusal
Negative Logits
Sensor      -0.63
ridor       -0.62
irez        -0.62
Extras      -0.61
umen        -0.59
apeake      -0.59
andise      -0.59
arbon       -0.58
mud         -0.57
ortment     -0.57
Positive Logits
whatsoever  1.04
ody         0.81
urnal       0.69
onsense     0.68
nor         0.63
dime        0.61
hesitation  0.61
coercion    0.60
pudding     0.60
ás          0.59
Activations Density 0.199%
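
A density figure like the one above is typically read as the share of token positions on which the feature fires. The page does not state how it is computed, so the sketch below is only one plausible reading: it assumes density is the fraction of positions with activation above a threshold of zero. The function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 2 of them active.
sample = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.199% would mean the feature fires on roughly 2 of every 1000 token positions.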