INDEX
Explanations
instances where someone is surprised or amazed by a particular event or revelation
expressions of surprise or disbelief
Negative Logits
Token          Logit
hemor          -0.69
interstitial   -0.60
bona           -0.60
osi            -0.59
Annotations    -0.58
bern           -0.58
accompan       -0.58
assum          -0.58
ilon           -0.58
winner         -0.57
Positive Logits
Token          Logit
how            0.95
by             0.75
surprised      0.74
seeing         0.74
when           0.67
how            0.67
aback          0.67
discover       0.65
amaz           0.64
HOW            0.64
Activations Density 0.061%
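
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the percentage of sampled token positions at which the feature's activation is greater than zero. This definition, the function name activation_density, and the sample data are all assumptions made for illustration; the page itself does not say how its 0.061% was derived.

```python
def activation_density(activations):
    """Return the percentage of entries that are strictly positive.

    Assumption: "density" means the share of sampled token positions
    where the feature fires at all (activation > 0).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 token positions, 6 of which activate the feature.
sample = [0.0] * 9994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
print(f"{activation_density(sample):.3f}%")
```

A density this low would mean the feature fires on only a handful of tokens per ten thousand, which fits a feature tied to a narrow pattern such as expressions of surprise.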