Explanations
instances of the verb "walk" in various forms
Negative Logits
upward      -0.19
éϵ          -0.16
座          -0.16
wards       -0.15
Extras      -0.14
rek         -0.14
itu         -0.14
inward      -0.14
climbing    -0.13
Extern      -0.13
Positive Logits
past        0.29
down        0.27
across      0.26
bare        0.23
through     0.23
around      0.22
along       0.21
past        0.20
circles     0.19
Past        0.19
Activations Density 0.145%