INDEX
Explanations
references to questions and answers related to technical information, specifically in a structured format
Neuron Alignment
Index   Value   % of L₁
50      +0.18   1.0%
32      +0.14   0.8%
920     +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
32      +0.18      0.04
920     +0.14      0.04
1480    +0.10      0.03
Negative Logits
<bos>           -3.26
                -0.95
/*              -0.93
ⓧ               -0.88
/**             -0.82
<?              -0.73
//              -0.67
///**           -0.66
HasIndex        -0.64
DeleteMapping   -0.64
Positive Logits
maneu    1.85
affor    1.55
Juf      1.49
erad     1.43
increa   1.41
shenan   1.40
Minang   1.39
fortn    1.37
wien     1.37
reluct   1.37
Activations Density: 0.085%