INDEX
Explanations
phrases indicating a choice or alternative
phrases that indicate alternatives or choices
Negative Logits
then       -0.86
ocracy     -0.78
NOW        -0.76
ires       -0.72
eth        -0.69
erest      -0.67
english    -0.64
our        -0.62
ETS        -0.62
now        -0.61
Positive Logits
ifice            1.44
chard            1.43
nam              1.43
acle             1.42
acles            1.37
chid             1.35
alternatively    1.25
otherwise        1.24
GAN              1.17
Else             1.11
Activations Density 0.169%