INDEX
Explanations
code-related terms and instructions
Neuron Alignment
Index   Value   % of L₁
1778    +0.15   0.6%
90      +0.12   0.5%
893     +0.12   0.5%
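The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming the column is the absolute value of each entry divided by the L₁ norm of the full vector: the full vector is not shown, so the norm is back-estimated from the first row, and the names `alignment`, `l1_estimate`, and `pct_of_l1` are hypothetical. Because the reported figures are rounded, the estimate is only approximate.

```python
# Values copied from the Neuron Alignment table (neuron index -> value).
alignment = {1778: 0.15, 90: 0.12, 893: 0.12}

# ASSUMPTION: "% of L1" = |value| / (L1 norm of the full vector).
# The full vector is not available here, so its L1 norm is estimated
# from the first row, which reports 0.6% for a value of 0.15.
reported_pct_top = 0.6  # percent, as shown in the table (rounded)
l1_estimate = abs(alignment[1778]) / (reported_pct_top / 100)


def pct_of_l1(value: float, l1_norm: float) -> float:
    """Share of the L1 norm carried by one entry, in percent."""
    return 100 * abs(value) / l1_norm


# Recompute each row's share under the assumed definition.
shares = {idx: round(pct_of_l1(v, l1_estimate), 1) for idx, v in alignment.items()}
print(f"estimated L1 norm: {l1_estimate:.1f}")
print(shares)
```

Under this assumption the three rows are mutually consistent: a norm near 25 reproduces the rounded shares of the two +0.12 entries as well as the +0.15 entry used for the estimate.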
Correlated Neurons
Index   P. Corr.   Cos Sim.
1778    +0.15      0.03
893     +0.12      0.02
90      +0.12      0.02
Negative Logits
<bos>             -0.62
addFlags          -0.48
setFlags          -0.46
-------------</   -0.45
zeta              -0.44
***!              -0.44
iança             -0.43
RLock             -0.43
ябрь              -0.43
Slf               -0.43
Positive Logits
Performed         1.08
Perform           1.07
PERFORM           1.01
perform           0.99
Performing        0.98
perform           0.96
Performed         0.96
perfon            0.95
ftu               0.93
Perform           0.93
Activations Density: 0.047%