Explanations
mentions or instances of specific options
references to various choices or alternatives
Negative Logits
Token      Logit
mind       -0.78
hemat      -0.77
soever     -0.72
olds       -0.69
rollers    -0.69
ilipp      -0.67
tub        -0.67
orks       -0.67
urst       -0.66
grave      -0.66
Positive Logits
Token      Logit
options    1.05
option     0.97
finder     0.84
Option     0.83
Option     0.81
choices    0.81
atives     0.80
choice     0.75
nels       0.73
Altern     0.71
Activations Density: 0.026%