Explanations
historical and biographical information
Neuron Alignment
Index   Value   % of L₁
1967    +0.18   0.6%
1343    +0.16   0.5%
845     +0.15   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
845     +0.18      0.06
198     +0.16      0.07
1967    +0.15      0.05
Negative Logits
.         -0.79
@[+][     -0.66
))){      -0.65
)))));    -0.64
PCell     -0.64
.’        -0.62
</h1>     -0.62
.”        -0.62
])).      -0.61
.“        -0.60
Positive Logits
shenan      1.56
scrat       1.53
indestru    1.51
unspeak     1.49
inconce     1.44
reluct      1.44
cushi       1.43
increa      1.42
hairc       1.42
disreg      1.40
Activations Density 0.886%