INDEX
Explanations
actions related to running or moving quickly
Negative Logits
upal      -0.18
orer      -0.16
useClass  -0.15
arga      -0.15
aleigh    -0.14
dzi       -0.14
Shel      -0.14
HOOK      -0.14
HC        -0.14
iola      -0.14
Positive Logits
è·ij      0.18
/run      0.17
RUN       0.17
imag      0.16
assing    0.16
RUN       0.15
run       0.15
races     0.15
running   0.15
(run      0.15
Activations Density 0.095%