INDEX
Explanations
terms related to guidance, instructions, and introductions
Neuron Alignment
Index  Value  % of L₁
50     +0.21  0.8%
1013   +0.09  0.3%
1978   +0.09  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
1017   +0.21     0.05
1744   +0.09     0.05
1695   +0.09     0.05
Negative Logits
<bos>          -2.28
ⓧ              -0.71
ღ              -0.65
HasColumnType  -0.63
protected      -0.62
</table>       -0.61
parcel         -0.61
public         -0.61
ਾਲ             -0.60
serve          -0.60
Positive Logits
unlaw          1.71
squa           1.67
increa         1.66
maneu          1.64
secon          1.63
impra          1.63
affor          1.62
reluct         1.62
Minang         1.60
guarante       1.60
Activations Density: 0.800%