INDEX
Explanations
numerical representations such as years, figures, and statistics
Neuron Alignment
Index   Value   % of L₁
50      +0.11   0.4%
687     +0.10   0.4%
486     +0.08   0.3%
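
As a rough illustration of how a "% of L₁" column could be derived, the sketch below assumes it is each neuron's absolute alignment value divided by the L₁ norm (sum of absolute values) of the full alignment vector. That definition, the function name `l1_share`, and the toy vector are assumptions for illustration and are not taken from the dashboard.

```python
def l1_share(values):
    """Return each entry's share of the vector's L1 norm (sum of absolute values)."""
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(v) / l1_norm for v in values]


# Toy alignment vector (hypothetical values, not the real feature).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
```

Under this assumed definition the shares always sum to 1, so a top entry of 0.4% would indicate an alignment spread thinly across many neurons.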
Correlated Neurons
Index   P. Corr.   Cos Sim.
776     +0.11      0.04
559     +0.10      0.04
814     +0.08      0.03
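
The two columns above measure different things, which the following minimal sketch tries to make concrete. It assumes "P. Corr." is a Pearson correlation between two activation series and "Cos Sim." is a cosine similarity between two vectors. The dashboard does not state which vectors are compared, so the inputs here are hypothetical toy data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance of mean-centered series over the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def cosine(xs, ys):
    """Cosine similarity: dot product over the product of Euclidean norms (no mean-centering)."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    return dot / (norm_x * norm_y)


# Hypothetical series: b is a shifted by a constant offset.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
r = pearson(a, b)  # the offset is removed by centering
c = cosine(a, b)   # the offset is not removed
```

Because Pearson correlation centers each series first and cosine similarity does not, the two can disagree on the same pair of vectors, which is one reason a dashboard might report both.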
Negative Logits
Token     Logit
<bos>     -0.90
ⓧ         -0.70
(blank)   -0.68
ുറ        -0.62
<?        -0.60
<?        -0.59
//{       -0.57
ੰ         -0.56
may       -0.56
ਿੱ        -0.55
Positive Logits
Token       Logit
maneu       1.76
increa      1.59
depic       1.51
stockholm   1.49
suscep      1.45
effe        1.45
emphat      1.44
disreg      1.44
guarante    1.42
inev        1.41
Activations Density 0.128%
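
For readers unfamiliar with the density figure, here is a minimal sketch of one common way to compute it. It assumes density means the fraction of tokens on which the feature's activation exceeds a threshold (zero by default). The function name `activation_density`, the threshold parameter, and the sample activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation is strictly above the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for a sparse feature.
sample = [0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.128% would mean the feature fires on roughly 1 in every 780 tokens.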