INDEX
Explanations
terms related to usability and user experience
Neuron Alignment
Index   Value   % of L₁
1870    +0.12   0.4%
1218    +0.06   0.2%
501     +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.12      0.03
1137    +0.06      0.05
1870    +0.06      0.02
Negative Logits
Token             Logit
<bos>             -1.33
/**               -0.81
[no token text]   -0.74
ⓧ                 -0.72
//                -0.71
<?                -0.70
so                -0.67
put               -0.66
do                -0.66
#                 -0.66
Positive Logits
Token      Logit
jaya       1.83
bandung    1.82
chèvre     1.72
matel      1.72
lele       1.68
!...       1.63
provence   1.62
wien       1.62
bordeaux   1.61
jawa       1.60
Activations Density: 0.225%