INDEX
Neuron Alignment
Index   Value   % of L₁
478     +0.17   1.0%
136     +0.12   0.6%
47      +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
494     +0.17      0.04
402     +0.12      0.03
50      +0.11      0.03
Negative Logits
¢       -1.66
¸       -1.59
isor    -1.46
ior     -1.45
head    -1.44
sted    -1.44
¡       -1.44
uchi    -1.42
↵       -1.41
ized    -1.38
Positive Logits
outright    1.70
its         1.63
other       1.63
vice        1.59
sexes       1.56
others      1.55
infinity    1.53
gger        1.48
slightest   1.43
anything    1.43
Activations Density: 0.354%