Explanations
instances of the word "what" and related phrases indicating inquiry or emphasis
Negative Logits
Token       Logit
Anything    -0.16
éĤ£ç§į      -0.16
illo        -0.15
ë©          -0.15
anything    -0.15
á»ģn        -0.15
GIN         -0.15
ÙĤد        -0.15
anything    -0.15
klä         -0.14
Positive Logits
Token       Logit
happens     0.18
we          0.17
happened    0.16
729         0.15
happen      0.15
happening   0.15
ultimately  0.14
ppe         0.14
zier        0.14
ãĥ³ãĥĸ      0.14
Activations Density: 0.027%