INDEX
Explanations
programming language identifiers and code structure elements
Neuron Alignment
Index   Value   % of L₁
156     +0.20   1.2%
23      +0.16   0.9%
263     +0.15   0.9%
Correlated Neurons
Index   P. Corr.   Cos Sim.
156     +0.20      0.68
111     +0.16      0.69
71      +0.15      0.63
Negative Logits
ème           -1.75
"}](#         -1.68
thesis        -1.53
journal       -1.46
argument      -1.46
STA           -1.43
CIT           -1.41
ле            -1.40
ikipedia      -1.38
VERTISEMENT   -1.36
Positive Logits
keit           1.70
nights         1.60
respectively   1.52
outright       1.50
lord           1.36
thing          1.34
ously          1.32
privileges     1.30
filled         1.28
uh             1.28
Activations Density 5.423%