INDEX
Explanations
questions or statements posing inquiries
questions posed in the text
Negative Logits
ventures    -0.76
"},"        -0.69
itches      -0.65
urous       -0.65
CBC         -0.63
parap       -0.62
ucl         -0.59
Late        -0.56
UCH         -0.55
reau        -0.55
Positive Logits
whether     1.34
why         1.14
how         1.12
WHY         1.07
whether     1.01
moot        0.88
why         0.85
what        0.83
whats       0.80
unanswered  0.79
Activations Density 0.066%