INDEX
Explanations
instances of code or programming-related keywords
Neuron Alignment
Index   Value   % of L₁
471     +0.13   0.8%
376     +0.13   0.8%
241     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
471     +0.13      0.02
397     +0.13      0.02
241     +0.12      0.02
Negative Logits
·¸      -4.29
ĭ       -4.12
»¿      -3.99
¿½      -3.92
ĸ´      -3.60
±       -3.57
´       -3.42
İ       -3.42
Īĺ      -3.41
ĨĴ      -3.33
Positive Logits
mos       1.77
vre       1.64
ios       1.58
gary      1.55
enstein   1.54
inet      1.47
DNA       1.47
pockets   1.45
endar     1.43
margins   1.43
Activations Density: 0.011%