Explanations
terms and phrases related to success and successful outcomes
Negative Logits
alla          -0.18
_success      -0.17
plode         -0.17
emer          -0.16
Success       -0.16
success       -0.16
_succ         -0.16
Success       -0.15
lesc          -0.15
succeeding    -0.15
Positive Logits
ness          0.28
outcome       0.28
outcomes      0.24
completion    0.24
ive           0.23
Outcome       0.23
outcome       0.22
completion    0.22
Outcome       0.22
mente         0.20
Activations Density: 0.037%