Explanations
questions and expressions of disbelief or frustration
Negative Logits
.nlm       -0.15
zent       -0.15
Crud       -0.14
åĵ         -0.14
OPS        -0.14
ictory     -0.14
Beard      -0.14
undry      -0.14
setSize    -0.14
koli       -0.14
Positive Logits
why        0.34
Why        0.26
how        0.24
why        0.24
WHY        0.21
shouldn    0.20
seriously  0.20
surely     0.20
How        0.20
Why        0.19
Activations Density: 0.288%