INDEX
Explanations
phrases related to contrasting choices or situations
conditional phrases that introduce alternatives or comparisons
NEGATIVE LOGITS
SHIP    -0.81
poke    -0.80
erest   -0.77
:(      -0.74
elsen   -0.74
enez    -0.72
ilege   -0.71
blems   -0.71
ATA     -0.71
efer    -0.70
POSITIVE LOGITS
otherwise   0.92
circumst    0.79
decay       0.73
wise        0.66
powerless   0.64
grain       0.64
nons        0.63
rogens      0.63
uncon       0.63
animate     0.63
Activations Density 0.176%