INDEX
Explanations
the word "cut" as well as phrases related to reductions or simplifications
Neuron Alignment
Index   Value   % of L₁
61      +0.14   0.5%
1328    +0.13   0.5%
597     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1142    +0.14      0.04
61      +0.13      0.04
569     +0.13      0.04
Negative Logits
Shells       -0.64
volunte      -0.60
Ferdinando   -0.60
Mlle         -0.59
Valerio      -0.59
Când         -0.57
Stretcher    -0.56
Mounts       -0.56
Illus        -0.56
Alejandra    -0.55
Positive Logits
cut       1.34
cut       1.30
cuts      1.25
cutting   1.20
CUT       1.19
Cut       1.19
Cut       1.17
cutting   1.17
cuts      1.16
CUT       1.13
Activations Density 0.103%