INDEX
Explanations
instances of the word "now" and variations of it indicating a sense of immediacy or urgency
Negative Logits
er          -0.15
lle         -0.15
unate       -0.14
urs         -0.14
ped         -0.14
but         -0.14
otherwise   -0.14
did         -0.13
kers        -0.13
_ABI        -0.13
Positive Logits
here          0.29
adays         0.27
HERE          0.21
imagine       0.20
withstanding  0.17
instead       0.17
comes         0.17
_that         0.17
suddenly      0.16
instead       0.15
Activations Density 0.029%