Explanations
mentions of "loss" of various kinds
Neuron Alignment
Index   Value   % of L₁
303     +0.14   0.5%
130     +0.13   0.5%
662     +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
130     +0.14      0.04
966     +0.13      0.04
303     +0.11      0.04
Negative Logits
/>";            -0.46
""}             -0.46
olverine        -0.44
ecuted          -0.41
possano         -0.41
Vere            -0.41
commodations    -0.40
vere            -0.40
pectives        -0.39
esistono        -0.38
Positive Logits
Loss            1.16
loss            1.15
Loss            1.15
loss            1.13
Losses          1.09
LOSS            1.06
losses          1.04
Losses          1.00
lost            0.95
LOST            0.95
Activations Density 0.089%