Explanations
personal anecdotes and stories about family
Neuron Alignment
Index   Value   % of L₁
964     +0.11   0.3%
1842    +0.10   0.3%
906     +0.09   0.3%
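The "% of L₁" column is not defined on the page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole vector. A minimal sketch of that calculation, using a hypothetical toy vector rather than the dashboard's data:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: this is what the dashboard's "% of L1" column reports;
    the page itself does not define the column.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical 4-entry vector, not taken from the dashboard.
toy = [0.5, -0.25, 0.125, 0.125]
print(f"{l1_share(toy, 0):.1%}")  # prints 50.0%
```

Under this reading, a value of +0.11 shown as 0.3% would imply the vector's mass is spread thinly across many neurons, with no single neuron dominating.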
Correlated Neurons
Index   P. Corr.   Cos Sim.
1553    +0.11      0.07
1533    +0.10      0.02
343     +0.09      0.06
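The two columns above report different similarity measures. Pearson correlation ("P. Corr.") subtracts each vector's mean before comparing, while cosine similarity ("Cos Sim.") compares the raw vectors. The page does not say which vectors are compared (activations, weights, or something else), so the sketch below uses hypothetical inputs only to show how the two measures can disagree:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical vectors: b is a shifted by a constant offset.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
print(pearson_correlation(a, b))  # close to 1: the offset is removed by centering
print(cosine_similarity(a, b))    # below 1: the offset changes the direction
```

Because only Pearson correlation removes the mean, the two numbers need not match, which is consistent with rows such as neuron 1533 showing +0.10 in one column and 0.02 in the other.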
Negative Logits
umo      -0.67
Jä       -0.65
surpl    -0.64
Simult   -0.64
parall   -0.59
asos     -0.57
kasa     -0.57
Illus    -0.56
Gorb     -0.55
makro    -0.55
Positive Logits
<bos>     0.70
parents   0.58
preco     0.56
school    0.53
dad       0.52
parents   0.52
kids      0.51
kid       0.50
PeEnEo    0.50
Parents   0.50
Activations Density 0.636%