INDEX
Explanations
technical terms related to software or hardware optimization and analysis
Neuron Alignment
Index   Value   % of L₁
453     +0.16   0.5%
1385    +0.14   0.4%
776     +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
776     +0.16      0.05
147     +0.14      0.03
1559    +0.13      0.04
Negative Logits
Token       Logit
encomp      -2.78
depic       -2.73
reluct      -2.68
guarante    -2.66
volunte     -2.61
accla       -2.60
disagre     -2.49
increa      -2.49
emphat      -2.46
purcha      -2.45
Positive Logits
Token              Logit
million            0.78
(no token text)    0.75
billion            0.72
(no token text)    0.70
则                 0.70
млн                0.69
(no token text)    0.68
(no token text)    0.68
也                 0.68
sum                0.67
Activations Density: 0.153%