INDEX
Explanations
encoded text and data related to product listings
Neuron Alignment
Index   Value   % of L₁
1343    +0.15   0.5%
732     +0.12   0.4%
1480    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1616    +0.15      0.02
1343    +0.12      0.02
732     +0.09      0.01
Negative Logits
–         -0.67
,         -0.65
his       -0.65
on        -0.63
(         -0.63
(blank)   -0.62
and       -0.62
he        -0.61
“         -0.61
by        -0.60
Positive Logits
applau      1.36
simplif     1.34
!...        1.34
jacques     1.32
marte       1.29
aquarelle   1.29
doman       1.28
ftu         1.26
?...        1.25
LIRE        1.23
Activations Density: 0.042%