INDEX
Explanations
questions or statements related to curiosity or uncertainty
instances of the word "wondering."
Negative Logits
ater        -0.58
roup        -0.57
Era         -0.55
Pro         -0.55
et          -0.55
Lead        -0.54
contained   -0.54
late        -0.53
CONTR       -0.53
lay         -0.52
Positive Logits
wondering       3.46
wondered        1.91
wonder          1.85
thinking        1.56
hoping          1.54
contemplating   1.50
guessing        1.45
fearing         1.40
noticing        1.40
curious         1.39
Activations Density 0.010%