INDEX
Explanations
commands related to inserting data
Neuron Alignment
Index    Value    % of L₁
156      +0.21    1.2%
430      +0.13    0.7%
344      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
148      +0.21       0.01
344      +0.13       0.02
376      +0.12       0.00
Negative Logits
Ī          -2.47
Ĥ¬         -2.04
(blank)    -1.97
↵↵↵        -1.97
(blank)    -1.97
č↵č↵       -1.97
↵          -1.97
↵ ³³³      -1.97
č↵         -1.97
↵ ↵        -1.97
Positive Logits
itive       1.60
controls    1.58
ions        1.56
IONS        1.53
into        1.52
ionate      1.52
INTO        1.51
jections    1.46
able        1.46
ible        1.46
Activations Density 0.096%