INDEX
Explanations
technical language related to system components and their configurations
Neuron Alignment
Index   Value   % of L₁
156     +0.20   1.1%
416     +0.12   0.7%
323     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
440     +0.20      0.05
368     +0.12      0.07
35      +0.11      0.06
Negative Logits
eda         -1.71
arest       -1.54
Argued      -1.53
overnment   -1.52
ardo        -1.50
chanical    -1.50
ilities     -1.49
rong        -1.46
ERTY        -1.43
indicated   -1.43
Positive Logits
¿½    2.94
ĺ     2.77
ĨĴ    2.76
Īĺ    2.66
¾     2.66
IJ    2.66
·¸    2.64
Ļª    2.63
¥     2.52
§     2.50
Activations Density 1.928%