INDEX
Explanations
instances where someone is surprised or impressed by something
phrases indicating surprise or amazement experienced by individuals
Negative Logits
ftime       -0.79
wipes       -0.75
teasp       -0.75
MX          -0.73
iatrics     -0.72
Runner      -0.70
hops        -0.68
ont         -0.68
ZA          -0.66
rim         -0.65
Positive Logits
how         0.86
what        0.82
heights     0.76
¥µ          0.70
assurances  0.70
this        0.69
adopting    0.69
Īè          0.68
sheer       0.67
seeing      0.66
Activations Density 0.125%