INDEX
Explanations
instances of the word "fail" with a high level of activation
contexts related to failure or the concept of failing
Negative Logits
Crescent    -0.63
ours        -0.63
Sands       -0.61
orney       -0.61
oust        -0.60
rams        -0.60
our         -0.60
Sterling    -0.60
arm         -0.57
Adrian      -0.57
Positive Logits
fail        3.75
fail        2.41
fails       2.37
Fail        2.36
Fail        2.34
failure     2.02
succeed     1.85
failed      1.84
failing     1.80
Failure     1.78
Activations Density: 0.013%