INDEX
Explanations
phrases related to assembling or putting things together
Neuron Alignment
Index  Value  % of L₁
549    +0.10  0.3%
849    +0.09  0.3%
872    +0.09  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
1372   +0.10     0.04
156    +0.09     0.03
976    +0.09     0.03
Negative Logits
Ename      -0.99
hairc      -0.98
Pamph      -0.92
Whence     -0.91
Engraved   -0.85
Stretcher  -0.85
oleo       -0.84
unwarran   -0.83
Eft        -0.81
ecru       -0.79
Positive Logits
pieces        0.98
puzzle        0.88
piece         0.82
piec          0.82
pieces        0.78
fragments     0.76
jigsaw        0.74
assembled     0.71
rompecabezas  0.70
piece         0.69
Activations Density 0.313%