Explanations
information related to events, organizations, and policy changes
Neuron Alignment
Index   Value   % of L₁
1961    +0.13   0.4%
411     +0.13   0.4%
878     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1961    +0.13      0.04
411     +0.13      0.03
878     +0.12      0.04
Negative Logits
Token      Logit
ftre       -1.31
fup        -1.31
ftu        -1.30
fta        -1.29
guarante   -1.28
vns        -1.27
laft       -1.25
poff       -1.21
fign       -1.21
reft       -1.20
Positive Logits
Token   Logit
;       1.02
$;      0.92
];      0.92
.;      0.91
>;      0.90
}$;     0.90
);      0.90
%;      0.90
;       0.88
;       0.87
Activations Density 0.098%