INDEX
Explanations
occurrences of the word "every."
Negative Logits
Token      Logit
ulate      -0.17
eworthy    -0.17
st         -0.15
ilitary    -0.14
kee        -0.14
ilent      -0.14
arith      -0.14
incare     -0.14
side       -0.14
atcher     -0.14
Positive Logits
Token      Logit
/all       0.21
hone       0.19
THING      0.18
things     0.17
where      0.17
ones       0.17
thin       0.16
einzel     0.16
though     0.16
ied        0.15
Activations Density: 0.047%