Explanations
code-related terms and components
Neuron Alignment
Index    Value    % of L₁
12       +0.19    1.1%
293      +0.14    0.9%
229      +0.13    0.8%
Correlated Neurons
Index    P. Corr.    Cos Sim.
12       +0.19       0.14
293      +0.14       0.11
337      +0.13       0.06
Negative Logits
ľĵ       -1.73
ŀ        -1.49
punk     -1.37
§        -1.36
uple     -1.34
·        -1.33
ļ        -1.32
kick     -1.30
psin     -1.30
Ĥ¬       -1.25
Positive Logits
ings          1.58
liography     1.54
journals      1.49
ncbi          1.48
families      1.46
STA           1.44
:**           1.39
artments      1.37
usalem        1.37
BH            1.37
Activations Density 3.850%