Explanations
instances of the word "surprise" and related concepts
Negative Logits
Token         Logit
schirm        -0.72
salle         -0.71
)_{           -0.66
bucket        -0.65
futbolista    -0.64
ArrowToggle   -0.63
helial        -0.62
iverr         -0.61
Polk          -0.61
Laub          -0.61

Positive Logits
Token         Logit
Surprise       1.20
surprise       1.13
surprise       0.99
Surprise       0.99
surprises      0.94
surprising     0.93
surpris        0.93
surprised      0.92
sterious       0.84
sympy          0.82
Activations Density 0.052%
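
As a rough illustration of what an activation density figure like the one above usually means, the sketch below computes the fraction of token positions on which a feature fires. This is an assumption about the metric, not something the page states: the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The dashboard may define it differently.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical toy data: 10,000 token positions, 5 of them active.
toy = [0.0] * 9995 + [1.3, 0.2, 4.1, 0.7, 2.5]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.050%
```

Under this reading, a density of 0.052% would mean the feature is active on roughly 5 of every 10,000 token positions sampled.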