INDEX
Explanations
the word "wonder" used in a context indicating surprise or amazement
instances of the phrase "no wonder."
Negative Logits
jri          -0.86
ategory      -0.83
aditional    -0.71
aution       -0.69
é¾           -0.69
iosis        -0.69
berman       -0.68
ictional     -0.68
aeper        -0.65
ĪĴ           -0.65
Positive Logits
why          0.84
McA          0.84
ment         0.82
WHY          0.75
aloud        0.74
ioned        0.73
lessly       0.70
why          0.70
ted          0.68
MAG          0.67
Activations Density 0.010%