Explanations: cryptographic terms or identifiers
Neuron Alignment
Index   Value   % of L₁
50      +0.27   1.2%
876     +0.26   1.2%
1499    +0.15   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
876     +0.27      -0.03
1499    +0.26      0.12
283     +0.15      -0.01
Negative Logits
Token   Logit
<bos>   -2.35
at      -0.96
put     -0.95
let     -0.94
don     -0.93
had     -0.92
so      -0.92
once    -0.91
,       -0.90
now     -0.90
Positive Logits
Token      Logit
effe       3.75
increa     3.62
affor      3.51
guarante   3.48
desir      3.44
maneu      3.44
wien       3.41
aen        3.36
inev       3.33
thut       3.31
Activations Density: 1.519%