INDEX
Explanations
phrases related to options or possibilities
questions or conditional phrases indicating possibilities
Negative Logits
then       -0.66
/          -0.62
estamp     -0.61
ords       -0.61
Prairie    -0.60
then       -0.59
THEN       -0.59
WATCHED    -0.58
onds       -0.55
ogun       -0.54
Positive Logits
simply         1.16
outright       1.08
merely         1.04
just           0.92
abouts         0.92
alternatively  0.92
altogether     0.92
downright      0.90
Else           0.88
unlucky        0.79
Activations Density 0.286%