Explanations
HTML formatting elements and attributes
Neuron Alignment
Index    Value    % of L₁
50       +0.16    0.7%
2019     +0.14    0.6%
381      +0.11    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1699     +0.16       0.04
605      +0.14       0.01
1445     +0.11       0.03
Negative Logits
<bos>        -2.10
quitted      -1.09
trod         -1.02
intersper    -0.99
<?           -0.95
frow         -0.90
reconno      -0.89
gild         -0.89
ⓧ            -0.89
alre         -0.88
Positive Logits
)">          0.72
}`}>         0.67
">           0.66
morales      0.66
capulco      0.66
}}">         0.65
?>">         0.64
funghi       0.63
<caption>    0.63
ledad        0.63
Activations Density 0.089%