INDEX
Explanations
the word "failure."
instances of the word "failure" and its variations in the text
Negative Logits
rete      -0.70
ople      -0.69
selves    -0.67
enta      -0.67
andise    -0.67
orthy     -0.66
eston     -0.65
wick      -0.65
iliary    -0.64
ript      -0.64
Positive Logits
miser          1.33
afe            0.88
catast         0.85
DEV            0.77
hard           0.76
horribly       0.76
rate           0.75
dism           0.74
spectacular    0.71
lust           0.71
Activations Density: 0.038%
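
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero (under that reading, 0.038% is roughly 38 active tokens per 100,000). The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires; the source page does not confirm this definition.
    """
    values = list(activations)
    if not values:
        return 0.0
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Toy data (made up): 10,000 tokens, 4 of which activate the feature.
toy_activations = [0.0] * 10_000
for position, value in [(12, 0.5), (345, 1.2), (6789, 0.1), (9000, 3.0)]:
    toy_activations[position] = value

density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.040%
```

A feature this sparse is consistent with the narrow explanation above: it would fire only on the rare tokens related to "failure".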