Explanations
URLs and website links
Neuron Alignment
Index   Value   % of L₁
1150    +0.10   0.3%
856     +0.09   0.3%
227     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1799    +0.10      0.03
1143    +0.09      0.02
678     +0.09      0.04
Negative Logits
Token             Logit
BeforeAll         -0.76
AfterEach         -0.75
يتيمه             -0.74
="&#              -0.74
vician            -0.74
Obrador           -0.73
íí                -0.72
ArgumentParser    -0.72
PathParam         -0.71
FailureListener   -0.70
Positive Logits
Token      Logit
reluct     2.23
encomp     2.22
affor      2.20
maneu      2.19
impra      2.17
increa     2.11
strick     2.11
guarante   2.10
accla      2.09
scrat      2.08
Activations Density: 0.193%