INDEX
Explanations
references to results or output values
Neuron Alignment
Index   Value   % of L₁
245     +0.14   0.8%
365     +0.13   0.7%
5       +0.12   0.7%
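
The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it is each weight's absolute value divided by the L₁ norm of the whole weight vector (the function name `percent_of_l1` and the toy vector are hypothetical, not taken from the dashboard):

```python
def percent_of_l1(weights):
    """Return each weight's share of the vector's L1 norm, in percent.

    Assumption: the share uses the absolute value of each weight.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1 for w in weights]


# Toy vector, not real dashboard data.
toy = [0.5, -0.25, 0.25]
shares = percent_of_l1(toy)
```

Under this reading, a value of +0.14 at 0.8% would imply an L₁ norm of roughly 17 to 18, which is consistent with the other two rows, though that is not confirmation of the definition.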
Correlated Neurons
Index   P. Corr.   Cos Sim.
80      +0.14      0.03
245     +0.13      0.03
5       +0.12      0.03
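
The two columns above can differ a lot for the same neuron. A minimal sketch of why, assuming "P. Corr." is a Pearson correlation and "Cos Sim." a cosine similarity (the page does not say which vectors each is computed over; the helper names and toy vectors here are hypothetical):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a],
                             [y - mean_b for y in b])


# Toy vectors, not real dashboard data.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
cos_uv = cosine_similarity(u, v)     # positive: all entries share a sign
corr_uv = pearson_correlation(u, v)  # negative: the trends are opposite
```

Because Pearson correlation removes the mean first, the two measures agree only when both vectors are already centered.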
Negative Logits
npmjs     -2.00
ij        -1.77
oxacin    -1.73
]>        -1.70
ĥ½        -1.69
.""       -1.67
]{.       -1.66
Č         -1.61
¶         -1.60
·¸        -1.59
Positive Logits
set           1.66
achievable    1.60
board         1.59
strings       1.57
achieved      1.53
result        1.50
atically      1.50
ats           1.50
fet           1.49
obtained      1.49
Activations Density: 0.095%