INDEX
Explanations
references to locations or addresses
Neuron Alignment
Index   Value   % of L₁
528     +0.11   0.4%
50      +0.10   0.4%
1492    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.11      0.04
349     +0.10      0.03
871     +0.08      0.02
Negative Logits
<bos>       -0.78
public      -0.69
</tbody>    -0.64
establish   -0.63
/*          -0.63
activate    -0.63
overcome    -0.60
//          -0.60
ensure      -0.60
improve     -0.60
Positive Logits
k           1.78
k           1.74
accla       1.52
affor       1.51
unden       1.50
unlaw       1.46
bourgeo     1.45
embodi      1.45
emphat      1.40
guarante    1.37
Activations Density 0.166%