INDEX
Explanations
uncertainty or lack of knowledge, especially in the form of questions
expressions of uncertainty or lack of knowledge
Negative Logits
ueller      -0.71
exting      -0.71
oided       -0.66
ammers      -0.65
wagen       -0.64
stro        -0.63
ccording    -0.62
aez         -0.62
onding      -0.61
tein        -0.61
Positive Logits
why         1.34
how         1.32
whether     1.27
what        1.19
if          1.16
anything    1.06
anymore     1.04
WHY         1.04
why         1.01
exactly     1.01
Activations Density 0.041%