Explanations
the word "rather" and phrases related to preference or choice
expressions of preference or choices
Negative Logits
rising      -0.76
calling     -0.72
oward       -0.70
emin        -0.68
apt         -0.67
Loading     -0.63
DAQ         -0.62
sembly      -0.62
essential   -0.62
ahead       -0.60
Positive Logits
spend       1.08
lose        1.06
starve      1.03
avoid       0.98
die         0.95
stay        0.93
gamble      0.91
rely        0.89
endure      0.89
be          0.89
Activations Density: 0.046%