INDEX
Explanations
locations and directions
Neuron Alignment
Index    Value    % of L₁
597      +0.18    0.7%
680      +0.12    0.5%
1757     +0.12    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
597      +0.18       0.02
814      +0.12       0.02
420      +0.12       0.02
Negative Logits
Ofic         -0.52
graphicx     -0.50
Vag          -0.49
Beaucoup     -0.48
Cependant    -0.48
astron       -0.48
Même         -0.48
Pourtant     -0.47
Dès          -0.47
Parmi        -0.47
Positive Logits
Southwest    0.91
Southeast    0.89
Northeast    0.89
Northwest    0.86
Southwest    0.86
outheast     0.84
southwest    0.82
Northwest    0.79
northwest    0.79
Northeast    0.78
Activations Density: 0.115%