INDEX
Explanations
expressions of curiosity or questioning thoughts
the phrase "I wonder if" or similar speculative statements expressing uncertainty or curiosity.
Negative Logits
lenker            -0.38
醐                -0.37
PhysRevD          -0.37
Bakgrunnsstoff    -0.37
sanitaires        -0.35
églises           -0.35
Batis             -0.34
esterni           -0.34
катерина          -0.34
Thick             -0.33
Positive Logits
Wonder            0.96
wonder            0.96
Wonder            0.92
wonder            0.90
WONDER            0.89
wonders           0.76
why               0.66
wondering         0.66
Wonders           0.65
Wander            0.60
Activations Density 0.092%
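
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that "activations density" means the fraction of token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are all assumptions for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density counts positions where the feature fires at all
    (activation > 0). The page itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.092% would mean the feature fires on roughly 9 of every 10,000 tokens, which fits a feature tied to a narrow set of tokens such as "wonder".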