INDEX
Explanations
instances of verbs related to movement
instances of the verb "ran"
Negative Logits
Token       Logit
activation  -0.72
alty        -0.72
yet         -0.71
omical      -0.71
illusion    -0.71
olia        -0.68
ĨĴ          -0.65
rent        -0.65
Lauder      -0.64
maturity    -0.63
Positive Logits
Token    Logit
Runner   0.88
swick    0.87
RUN      0.86
running  0.77
runner   0.75
running  0.74
Running  0.73
lasses   0.73
runners  0.73
runner   0.73
Activations Density: 0.015%