INDEX
Explanations
individuals or actors in a scenario
Neuron Alignment
Index   Value   % of L₁
1919    +0.21   0.7%
184     +0.12   0.4%
1978    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1919    +0.21      0.09
862     +0.12      0.03
513     +0.12      0.03
Negative Logits
mef         -1.86
intersper   -1.83
secon       -1.79
buc         -1.71
fluo        -1.68
lyon        -1.67
ivi         -1.67
Keny        -1.66
increa      -1.66
Augu        -1.66
Positive Logits
chose       0.82
got         0.79
could       0.79
XmlEnum     0.78
took        0.77
can         0.77
cannot      0.76
choose      0.76
had         0.74
has         0.74
Activations Density: 0.215%