INDEX
Explanations
numerical data points or metrics
Neuron Alignment
Index    Value    % of L₁
151      +0.12    0.6%
425      +0.11    0.6%
374      +0.10    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
151      +0.12       0.01
221      +0.11       0.01
374      +0.10       0.01
Negative Logits
Imm             -1.42
herin           -1.42
enough          -1.41
hes             -1.40
ently           -1.39
respectively    -1.36
thood           -1.35
tree            -1.33
isc             -1.30
ron             -1.29
Positive Logits
RSOS            1.77
{|              1.74
RSP             1.67
IJ              1.49
Defendant       1.44
––––––––        1.42
boil            1.41
Returns         1.37
ésident         1.34
etted           1.33
Activations Density 0.020%