INDEX
Explanations
references to legal terminology and regulations
Neuron Alignment
Index    Value    % of L₁
369      +0.15    0.8%
156      +0.14    0.7%
233      +0.12    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
429      +0.15       0.02
215      +0.14       0.03
360      +0.12       0.02
Negative Logits
Token             Logit
·¸                -2.19
TRODUCTION        -1.76
ção               -1.75
izumab            -1.72
ffect             -1.72
cology            -1.66
↵                 -1.64
↵                 -1.64
<|outofrange|>    -1.64
<|outofrange|>    -1.64
Positive Logits
Token     Logit
derr      1.84
up        1.74
ups       1.63
ings      1.61
oke       1.60
oked      1.59
encil     1.54
uart      1.52
ellate    1.51
ede       1.46
Activations Density: 0.056%