Explanations
numbers and percentages within text
Neuron Alignment
Index   Value   % of L₁
50      +0.21   1.1%
32      +0.10   0.5%
1096    +0.10   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
80      +0.21      0.07
1590    +0.10      0.08
32      +0.10      0.07
Negative Logits
<bos>            -3.54
AssemblyTitle    -0.63
pursue           -0.62
///**            -0.62
fileprivate      -0.61
<?               -0.60
cooperate        -0.60
while            -0.59
make             -0.59
AssemblyCompany  -0.59
Positive Logits
véhic      1.35
stockholm  1.27
napoli     1.26
incess     1.24
tramont    1.23
inev       1.23
increa     1.21
applau     1.21
maneu      1.20
Juf        1.20
Activations Density 0.320%