INDEX
Explanations
instances of the word "decision" and its variations
Negative Logits
Token         Logit
ewitness      -0.82
rites         -0.71
antis         -0.66
ittens        -0.65
heres         -0.65
ateurs        -0.65
outh          -0.64
aired         -0.64
anti          -0.64
nces          -0.64
Positive Logits
Token         Logit
decision      0.95
regarding     0.92
decisions     0.92
wisely        0.89
whether       0.86
choices       0.84
accordingly   0.83
unilaterally  0.83
CHO           0.80
choice        0.79
Activations Density: 0.035%