INDEX
Explanations
mentions of events, actions, or decisions that did not succeed
instances of the word "failed."
Negative Logits
Token       Logit
enfranch    -0.78
selves      -0.70
order       -0.69
til         -0.69
Cloud       -0.67
tip         -0.66
gon         -0.66
agon        -0.66
Forward     -0.65
orthy       -0.65
Positive Logits
Token       Logit
miser        1.22
fail         0.94
failures     0.90
failure      0.87
fail         0.85
fails        0.85
horribly     0.85
failed       0.83
attempt      0.83
failing      0.80
Activations Density: 0.011%
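
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading on toy data; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up): 100,000 token positions, 11 of which are active.
toy_activations = [0.0] * 99_989 + [1.5] * 11
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.011% would correspond to roughly 11 active positions per 100,000 tokens, which fits a feature that fires only on a narrow set of failure-related words.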