INDEX
Explanations
references to community-related events and interactions
Neuron Alignment
Index   Value   % of L₁
288     +0.17   1.0%
104     +0.14   0.8%
118     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
288     +0.17      0.09
145     +0.14      0.10
248     +0.13      0.07
Negative Logits
harmless     -1.70
([]          -1.61
ulative      -1.53
himself      -1.52
habits       -1.50
(['          -1.49
liar         -1.48
myself       -1.47
yourselves   -1.45
into         -1.44
Positive Logits
peak            1.83
apex            1.73
intersections   1.72
momento         1.71
intersection    1.63
periphery       1.62
center          1.62
terminus        1.61
Plaza           1.61
expense         1.53
Activations Density: 1.287%