INDEX
Explanations
lines of code or structured data
Neuron Alignment
Index   Value   % of L₁
330     +0.12   0.7%
19      +0.11   0.6%
420     +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
420     +0.12      0.01
66      +0.11      0.01
330     +0.10      0.01
Negative Logits
ĥ½          -1.93
anonymity   -1.66
others      -1.65
nominal     -1.59
angels      -1.53
imony       -1.51
anyone      -1.50
someone     -1.48
rians       -1.48
uties       -1.46
Positive Logits
ARRAY       1.90
iewicz      1.90
wheel       1.84
works       1.72
paste       1.70
umber       1.64
hole        1.61
orest       1.59
STRING      1.59
brew        1.57
Activations Density 0.009%