INDEX
Explanations
verbs related to physical movement
Neuron Alignment
Index   Value   % of L₁
544     +0.12   0.4%
1758    +0.12   0.4%
1950    +0.11   0.4%
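
The dashboard does not define the "% of L₁" column. One plausible reading, assumed here, is that it gives the absolute value of a neuron's entry as a share of the L1 norm of the whole weight vector. Under that assumption, a value of +0.12 at 0.4% would imply an L1 norm of roughly 30. The sketch below illustrates the arithmetic only; `l1_share` and `toy_weights` are hypothetical names, and the numbers are not the feature's real weights.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means this ratio; the source does not confirm it.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not taken from the dashboard.
toy_weights = [0.5, -0.25, 0.25]
print(f"{l1_share(toy_weights, 0):.1%}")  # 50.0%
```

A small share such as 0.4% would mean that no single neuron dominates the vector, which is consistent with the three listed neurons having nearly identical values.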
Correlated Neurons
Index   P. Corr.   Cos Sim.
1758    +0.12      0.05
1950    +0.12      0.04
1178    +0.11      0.04
Negative Logits
cactus      -0.43
Revert      -0.39
Thresh      -0.37
biscuit     -0.37
Scor        -0.37
printout    -0.37
المعيارى    -0.37
Deviation   -0.37
Unlocked    -0.36
canary      -0.36
Positive Logits
move     1.05
move     1.03
moves    1.01
moved    1.00
moving   0.96
moved    0.95
MOVE     0.93
moving   0.92
Move     0.92
MOVING   0.91
Activations Density: 0.128%