Explanations
descriptions of physical movements and positioning
Neuron Alignment
Index    Value    % of L₁
1385     +0.18    0.6%
1741     +0.13    0.4%
906      +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
131      +0.18       0.04
1526     +0.13       0.04
448      +0.11       0.04
Negative Logits
apprehen     -0.63
vainly       -0.62
Whence       -0.57
nobly        -0.55
gaily        -0.55
indescri     -0.53
unspeak      -0.53
tolerably    -0.53
ineffec      -0.52
sgn          -0.51
Positive Logits
own          0.73
reputa       0.57
brille       0.57
vinci        0.55
cluse        0.54
bunda        0.53
ché          0.53
bebes        0.52
dè           0.52
hamburg      0.52
Activations Density 0.228%