INDEX
Explanations
commented code segments that are easy to read and understand, containing ample comments
Neuron Alignment
Index   Value   % of L₁
876     +0.22   0.7%
872     +0.12   0.4%
1150    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
876     +0.22      -0.01
872     +0.12      0.05
678     +0.12      0.04
Negative Logits
Cár          -0.77
kooper       -0.72
demokrat     -0.70
Demokrat     -0.68
Concepción   -0.66
Batalla      -0.63
Milán        -0.62
keramik      -0.60
Arque        -0.59
akade        -0.59
Positive Logits
particolar   1.01
affez        1.00
disreg       0.99
overla       0.97
maneu        0.95
scrat        0.94
cabrio       0.94
intermitt    0.93
curé         0.88
sappi        0.88
Activations Density: 0.380%