Explanations
phrases indicating movement from one place to another
references to paths or journeys
Negative Logits
lishes         -0.68
actionGroup    -0.65
iston          -0.65
artif          -0.64
esson          -0.60
onom           -0.60
pse            -0.59
Artificial     -0.59
cons           -0.58
unn            -0.58
Positive Logits
home           1.41
packing        0.98
home           0.97
Home           0.96
HOME           0.94
downstairs     0.92
west           0.89
back           0.87
thence         0.86
north          0.86
Activations Density 0.228%