INDEX
Explanations
quotations or statements attributed to individuals in the text
Neuron Alignment
Index   Value   % of L₁
521     +0.15   0.5%
663     +0.14   0.5%
228     +0.14   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
521     +0.15      0.05
228     +0.14      0.05
663     +0.14      0.04
Negative Logits
kompati    -0.69
kristal    -0.59
arroll     -0.58
silikon    -0.58
Matériau   -0.55
benzin     -0.55
mikrofon   -0.55
foton      -0.54
maske      -0.54
optik      -0.54
Positive Logits
reconno    0.76
cytoplas   0.75
macrop     0.72
<bos>      0.72
suscep     0.69
glau       0.66
jacques    0.65
says       0.64
chromos    0.64
olivia     0.63
Activations Density: 0.112%