INDEX
Explanations
the word "prefer" or variations of it
expressions of personal preferences
Negative Logits
brance      -0.79
breakers    -0.73
$$          -0.68
pack        -0.66
gren        -0.66
breaks      -0.64
infeld      -0.64
breaker     -0.63
CD          -0.63
Impl        -0.63
Positive Logits
rals                 0.85
itism                0.75
ably                 0.69
Mistress             0.68
quickShipAvailable   0.64
embodiments          0.63
ancy                 0.63
secretaries          0.61
sticking             0.61
laus                 0.61
Activations Density 0.016%