INDEX
Explanations
phrases indicating surprise or disbelief
Follows negations or expressions of uncertainty
not surprising
Negative Logits
Token           Logit
itſelf          -0.89
Theſe           -0.85
Reſ             -0.84
themſelves      -0.83
purpoſe         -0.79
ſche            -0.77
pleaſure        -0.77
himſelf         -0.77
UnknownFields   -0.75
ſeveral         -0.75
Positive Logits
Token           Logit
wonder          2.01
wonder          1.55
Wonder          1.42
Wonder          1.27
wonders         1.27
WONDER          1.23
难怪            1.12
surprise        1.09
wondered        1.02
wondering       0.98
Activations Density 0.120%
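
The page does not say how the density figure is computed. A minimal sketch, assuming the common reading that density is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic behind a value such as 0.120%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 12 of which activate the feature.
sample = [0.0] * 9988 + [1.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.120% means the feature fires on roughly 12 of every 10,000 tokens, which fits a feature tied to a narrow family of tokens such as "wonder" and its variants.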