INDEX
Explanations
instances of the word "are" in various contexts
Negative Logits
EVERY      -0.16
stuff      -0.16
anything   -0.16
Anything   -0.15
itself     -0.15
anything   -0.15
anda       -0.14
alles      -0.14
something  -0.14
gist       -0.14
Positive Logits
times      0.24
few        0.23
fewer      0.23
no         0.22
two        0.21
plenty     0.20
certain    0.19
some       0.19
always     0.18
several    0.18
Activations Density 0.067%