INDEX
Explanations
queries that begin with "How do"
sentences that pose questions about actions or inquiries
Negative Logits
Immunity     -0.72
Handling     -0.68
board        -0.66
Reviewer     -0.65
workshop     -0.65
bane         -0.64
ridden       -0.64
Jar          -0.64
cised        -0.63
mares        -0.62
Positive Logits
omsday       1.12
ppel         0.84
herty        0.77
onga         0.73
zens         0.73
?]           0.72
ctr          0.72
kson         0.69
orman        0.69
impressions  0.68
Activations Density 0.026%