Explanations
expressions of lack of options or choices
Negative Logits
Token      Logit
owell      -0.18
utzer      -0.15
_exempt    -0.14
coni       -0.14
ene        -0.14
itler      -0.14
ral        -0.13
atable     -0.13
654        -0.13
asma       -0.13
Positive Logits
Token          Logit
choice         0.29
alternative    0.27
alternatives   0.24
option         0.24
forced         0.23
choices        0.23
Choice         0.23
alternative    0.22
no             0.22
forced         0.22
Activations Density: 0.089%
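
The page does not say how the density figure is computed. One common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.2] * 9
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density well below 1% would mean the feature fires on only a small share of tokens, which fits a feature tied to a narrow set of words such as "choice" and "alternative".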