INDEX
Explanations
references to the Python programming language and related libraries
Neuron Alignment
Index   Value   % of L₁
156     +0.29   1.7%
410     +0.13   0.7%
321     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
109     +0.29      0.01
431     +0.13      0.01
298     +0.13      0.02
Negative Logits
ĥ½       -1.80
·        -1.77
ľĵ       -1.72
uality   -1.69
Ĭ        -1.56
ľ        -1.54
ı        -1.52
ĭ        -1.50
ori      -1.49
ĺ        -1.47
Positive Logits
bing          1.81
erals         1.61
interpreter   1.60
tutorials     1.51
versions      1.48
80211         1.47
Script        1.45
tutorial      1.45
cias          1.44
version       1.43
Activations Density 0.077%