INDEX
Explanations
mentions of failed attempts or actions
instances of the word "failed" in various contexts
Negative Logits
alon        -0.81
ollen       -0.80
erie        -0.80
region      -0.77
arya        -0.74
erning      -0.72
lua         -0.71
anguage     -0.70
istics      -0.70
rence       -0.69
Positive Logits
attempt          0.98
attempts         0.96
merger           0.91
coup             0.87
pregnancy        0.87
experiment       0.84
bidder           0.82
pregnancies      0.81
reconciliation   0.79
candidate        0.78
Activations Density 0.077%