INDEX
Explanations
references to new updates, changes, or features in different products or technologies
Neuron Alignment
Index   Value   % of L₁
50      +0.27   1.6%
1178    +0.13   0.8%
528     +0.13   0.8%
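
The "% of L₁" column is not defined on the page. One plausible reading, offered only as an assumption, is that it gives each neuron's absolute value as a share of the L1 norm of the full alignment vector. The listed rows are consistent with that reading (0.27 is roughly twice 0.13, and 1.6% is twice 0.8%). A minimal sketch follows; the function name `l1_share` and the toy vector are hypothetical and are not the real feature weights.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means a single entry's share of the sum of
    absolute values. The dashboard itself does not state this.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Toy vector for illustration only, NOT the real alignment vector.
toy_weights = [0.27, 0.13, 0.13, -0.47]
print(f"{l1_share(toy_weights, 0):.1%}")
```

Under this reading, small percentages such as 1.6% would indicate that the feature is spread over many neurons rather than concentrated in a few.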
Correlated Neurons
Index   P. Corr.   Cos Sim.
920     +0.27      0.08
2004    +0.13      0.06
689     +0.13      0.06
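
The two columns above report different similarity measures, which is why they need not agree. As a rough sketch of the standard definitions: cosine similarity compares the directions of two vectors, while Pearson correlation is the cosine similarity of the same vectors after each has been mean-centered. Which vectors the page actually compares (activations, weights, or something else) is not stated, so the inputs below are placeholders and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity of centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Placeholder data for illustration only, NOT taken from the dashboard.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
print(cosine_similarity(u, v), pearson_correlation(u, v))
```

In the placeholder data the two vectors point in broadly similar directions (positive cosine) while varying in opposite ways around their means (negative correlation), which illustrates how the two columns can diverge.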
Negative Logits
Token      Logit
<bos>      -2.89
<?         -0.85
/***       -0.74
lateinit   -0.73
(blank)    -0.71
/*         -0.71
Vegeu      -0.71
ⓧ          -0.71
Corte      -0.67
censiti    -0.67
Positive Logits
Token        Logit
unlaw        2.03
Juf          1.96
disagre      1.95
wherea       1.95
increa       1.93
reluct       1.93
affor        1.89
accla        1.89
stockholm    1.87
impractica   1.84
Activations Density 0.341%