INDEX
Explanations
instances of legal terminology and references to court cases
Neuron Alignment
Index   Value   % of L₁
271     +0.15   0.9%
442     +0.14   0.8%
263     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
271     +0.15      -0.01
431     +0.14      0.06
207     +0.12      0.06
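The page does not say how the two columns of the Correlated Neurons table are computed. As a minimal sketch, assuming "P. Corr." is the Pearson correlation between the feature's activations and a neuron's activations over the same tokens, and "Cos Sim." is the cosine similarity between two direction vectors, the two measures could be computed as follows. The function names and the sample values are hypothetical and are not taken from the dashboard.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length activation series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical per-token activations (assumption: not real dashboard data).
feature_acts = [0.0, 0.0, 1.2, 0.0, 0.8]
neuron_acts = [0.1, -0.2, 0.9, 0.0, 0.7]
print(round(pearson(feature_acts, neuron_acts), 3))

# Hypothetical direction vectors (assumption: not real weights).
print(cosine([1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions the two columns answer different questions: one compares behaviour across tokens, the other compares directions in weight space. That would be consistent with neuron 271 showing a correlation of +0.15 alongside a cosine similarity of -0.01.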
Negative Logits
:`            -1.79
(),           -1.54
gage          -1.52
ieurs         -1.38
issions       -1.37
capacities    -1.35
reptococcus   -1.32
({            -1.32
ÃŃv           -1.31
clamation     -1.30
Positive Logits
ĻĤ        1.52
anytime   1.45
east      1.41
thing     1.40
monks     1.27
)\].      1.26
knights   1.25
acle      1.24
dep       1.19
instein   1.19
Activations Density: 2.637%