Explanations
The term "further" and its variants, indicating exploration or expansion of ideas
Negative Logits
Token         Logit
further       -0.21
sel           -0.19
Further       -0.18
weiter        -0.18
ses           -0.18
run           -0.17
furthermore   -0.17
shan          -0.16
farther       -0.16
sen           -0.16
Positive Logits
Token         Logit
ance          0.37
ing           0.35
ado           0.29
most          0.29
-reaching     0.28
ed            0.26
nore          0.23
hin           0.22
-more         0.22
ANCE          0.22
Activations Density 0.025%