INDEX
Explanations
the word "tried" in phrases
instances of the word "tried."
Negative Logits
scribe       -0.76
head         -0.71
front        -0.69
ificantly    -0.66
inant        -0.65
thus         -0.65
region       -0.63
gap          -0.63
houses       -0.62
cum          -0.62
Positive Logits
unsuccessfully   1.19
anke             0.85
nesday           0.79
tried            0.78
desperately      0.77
":"/             0.76
frantically      0.73
nir              0.73
repeatedly       0.72
ichick           0.72
Activations Density 0.040%
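
To put the density figure in perspective, the sketch below shows one way such a number could be computed. It assumes that "activation density" means the fraction of sampled token positions on which the feature fires above a threshold; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which activate the feature.
sample = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.040% corresponds to the feature firing on roughly 4 out of every 10,000 tokens, which fits a feature tied to a single word such as "tried".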