INDEX
Explanations
programming-related terms and structures
Neuron Alignment
Index   Value   % of L₁
342     +0.14   0.8%
302     +0.13   0.8%
160     +0.13   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
160     +0.14      0.14
49      +0.13      0.14
408     +0.13      0.08
Negative Logits
¬             -1.70
eness         -1.63
'):           -1.55
¾             -1.51
precautions   -1.48
Īĺ            -1.47
?’            -1.44
'),           -1.44
ĨĴ            -1.37
vent          -1.35
POSITIVE LOGITS
orectal       1.63
arcoma        1.59
atte          1.58
áĢº           1.53
orta          1.49
amsbsy        1.49
orter         1.48
lei           1.45
amssymb       1.45
phia          1.45
Activations Density: 4.070%