Explanations
the word "is" at the beginning of a sentence
repeated phrases or structures that start with "this is."
Negative Logits
Token         Logit
isms          -0.67
luaj          -0.66
nesses        -0.66
selves        -0.65
interven      -0.64
forts         -0.64
cknow         -0.61
occup         -0.61
ellig         -0.60
Parameters    -0.60
Positive Logits
Token         Logit
my            0.94
definitely    0.84
why           0.84
gonna         0.83
how           0.83
NOT           0.78
probably      0.78
what          0.77
an            0.75
another       0.73
Activations Density: 0.102%
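
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that arithmetic under that assumption; the page itself does not define the metric. The function name `activation_density`, the zero threshold, and the sample data are hypothetical, chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The threshold of 0.0 is also an assumption.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 10 of which are active.
sample = [0.0] * 9990 + [1.5] * 10
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 0.102% would correspond to roughly one active position per thousand tokens.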