Explanations
instances of the word "connecting"
Neuron Alignment
Index   Value   % of L₁
376     +0.20   1.1%
457     +0.16   0.9%
256     +0.16   0.9%
Correlated Neurons
Index   P. Corr.   Cos Sim.
256     +0.20      0.01
457     +0.16      0.01
329     +0.16      0.01
Negative Logits
onse        -1.53
ausing      -1.39
recent      -1.36
ying        -1.35
committed   -1.34
iding       -1.34
cele        -1.33
ellow       -1.32
famous      -1.32
ooting      -1.31
Positive Logits
thereto     2.03
them        1.95
feit        1.87
thon        1.80
dots        1.71
gaps        1.66
parts       1.52
him         1.51
equations   1.47
obstacles   1.45
Activations Density 0.005%