INDEX
Explanations
phrases indicating movement or direction
Negative Logits
both          -0.23
both          -0.18
all           -0.16
Both          -0.15
ledge         -0.14
BOTH          -0.14
everything    -0.14
thing         -0.14
ining         -0.14
intrinsic     -0.14
Positive Logits
town           0.28
town           0.21
civilization   0.20
darkest        0.19
heaven         0.19
camp           0.19
headquarters   0.19
bed            0.18
jail           0.18
church         0.18
Activations Density: 0.706%