INDEX
Explanations
numbers and mathematical operations in code-related contexts
Neuron Alignment
Index   Value   % of L₁
453     +0.16   0.5%
1108    +0.14   0.4%
1343    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
453     +0.16      0.04
1108    +0.14      0.03
1036    +0.12      0.01
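The P. Corr. and Cos Sim. columns can differ widely for the same neuron. A minimal sketch of the two standard formulas follows, assuming "P. Corr." means Pearson correlation and "Cos Sim." means cosine similarity. The page does not say which vectors each metric is computed over, so the function names and the sample vectors are hypothetical and only illustrate the arithmetic.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Undefined (division by zero) if either vector is all zeros.
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    # Undefined if either vector is constant (zero variance).
    return cosine_similarity(centered_x, centered_y)


if __name__ == "__main__":
    # Hypothetical vectors, not taken from the dashboard.
    u = [1.0, 2.0, 3.0]
    v = [3.0, 2.0, 1.0]
    print(f"cosine similarity:   {cosine_similarity(u, v):+.3f}")
    print(f"pearson correlation: {pearson_correlation(u, v):+.3f}")
```

Because Pearson correlation removes each vector's mean first, two vectors can have a clearly positive cosine similarity and still be negatively correlated, as the sample pair shows.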
Negative Logits
disagre     -2.26
apprehen    -2.23
intersper   -2.21
shenan      -2.19
encomp      -2.10
reluct      -2.08
affor       -2.06
volunte     -1.99
snoopy      -1.97
gaily       -1.97
Positive Logits
كومونز            0.80
FailureListener   0.71
                  0.70
                  0.67
mathrm            0.67
                  0.67
Obrador           0.66
                  0.66
                  0.66
<u>               0.66
Activations Density 0.072%