Explanations
actions involving rotation or movement
Negative Logits
Token      Logit
riend      -0.16
steady     -0.16
voks       -0.15
ardu       -0.15
apus       -0.15
APTER      -0.15
raid       -0.14
ppo        -0.14
ÄIJo       -0.14
presso     -0.14

Positive Logits
Token      Logit
rotation   0.28
TURN       0.28
turn       0.27
rotate     0.27
-turn      0.26
Rotation   0.26
turns      0.25
turn       0.25
rotations  0.25
Rotate     0.25
Activations Density: 0.069%
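
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that "density" means the fraction of sampled token positions on which the feature's activation is above zero; the source does not define the term, so that reading is an assumption. The function name, the threshold parameter, and the sample data are hypothetical and chosen only so the arithmetic lands on the reported value.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: density = (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 69 of them active.
sample = [1.0] * 69 + [0.0] * 99_931
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.069%
```

Under that assumption, a density of 0.069% corresponds to roughly 69 active positions per 100,000 tokens, so the feature fires rarely.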