INDEX
Explanations
words related to options and choices
phrases that introduce alternatives or options
Negative Logits
Holmes      -0.59
vulner      -0.57
horizont    -0.56
Clancy      -0.52
Malone      -0.52
Goodman     -0.51
Leaks       -0.51
Petersen    -0.50
Hub         -0.49
Rover       -0.49
Positive Logits
Else             1.29
chard            1.28
acle             1.14
acles            1.12
acular           1.12
lando            1.11
chid             1.08
nery             1.03
else             1.00
alternatively    0.98
Activations Density: 0.036%
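
The page does not define how the density figure is computed. A minimal sketch, assuming it denotes the fraction of token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and for illustration only:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which activate the feature.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.036% would mean the feature fires on roughly 36 of every 100,000 token positions, which fits a sparse feature tied to a narrow set of words such as "else" and "alternatively".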