INDEX
Explanations
code snippets in a programming language
Neuron Alignment
Index   Value   % of L₁
453     +0.17   0.6%
1967    +0.14   0.5%
876     +0.14   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
678     +0.17      0.04
876     +0.14      0.00
1222    +0.14      0.03
Negative Logits
,            -0.70
.            -0.69
West         -0.66
Saint        -0.65
So           -0.64
J            -0.63
QMetaType    -0.63
St           -0.63
so           -0.62
Jets         -0.61
Positive Logits
!...         1.59
fordable     1.47
chande       1.42
michelin     1.41
dises        1.40
vogli        1.39
parma        1.39
?...         1.38
!!</         1.35
seiz         1.34
Activations Density: 0.069%