Explanations
phrases indicating someone's preferences or desires
the phrase "I don't want to" and its variations
Negative Logits
soDeliveryDate  -0.88
Enhancement     -0.69
ccording        -0.68
Needs           -0.65
footed          -0.64
figured         -0.63
adder           -0.62
Compass         -0.62
reperto         -0.62
gradient        -0.61
Positive Logits
anymore      1.44
nor          1.09
anywhere     1.03
necessarily  1.03
bother       1.02
whatsoever   0.93
anybody      0.93
any          0.92
anything     0.91
mention      0.87
Activations Density 0.190%