INDEX
Explanations
phrases emphasizing the word "whatever" in various contexts
Negative Logits
everything    -0.21
ses           -0.19
Everything    -0.17
everything    -0.15
Everything    -0.15
Things        -0.15
all           -0.14
altogether    -0.14
alles         -0.14
tudo          -0.14
Positive Logits
else          0.29
floats        0.24
anyone        0.23
anybody       0.22
happens       0.21
else          0.20
Float         0.19
reason        0.19
ELSE          0.19
happened      0.19
Activations Density 0.047%