INDEX
Explanations
the word "be" in various contexts
phrases beginning with "don't be."
Negative Logits
inally          -0.86
yip             -0.84
Enhancement     -0.75
ciation         -0.73
hail            -0.69
notch           -0.69
ordes           -0.68
plings          -0.66
ciating         -0.65
¶               -0.64
Positive Logits
confused        1.06
bothered        1.06
fooled          1.02
mistaken        0.98
intimidated     0.97
underestimated  0.95
ashamed         0.95
swayed          0.92
afraid          0.91
alarmed         0.91
Activations Density: 0.100%