INDEX
Explanations
the word "option."
references to alternative choices or options
Negative Logits
hemat    -0.79
orks     -0.77
uzz      -0.70
ARS      -0.66
owan     -0.65
gian     -0.65
gae      -0.65
ritic    -0.64
Wars     -0.64
avia     -0.63
Positive Logits
options         1.02
option          0.96
alternatives    0.93
Altern          0.86
finder          0.83
atives          0.82
Option          0.81
ossibility      0.78
choices         0.77
choice          0.75
Activations Density: 0.027%