INDEX
Explanations
phrases related to legal terminology and concepts
Neuron Alignment
Index  Value  % of L₁
98     +0.15  0.8%
282    +0.13  0.7%
478    +0.13  0.7%
Correlated Neurons
Index  P. Corr.  Cos Sim.
457    +0.15     0.09
404    +0.13     0.10
407    +0.13     0.08
Negative Logits
ian      -1.52
berg     -1.42
ring     -1.42
clashes  -1.40
wide     -1.40
hold     -1.36
pic      -1.34
=.       -1.33
ruptcy   -1.33
dyst     -1.31
Positive Logits
MOESM          2.26
thems          1.81
FPar           1.71
each           1.59
supplementary  1.58
ourselves      1.58
fered          1.58
behalf         1.56
---|---|---    1.55
Fig            1.53
Activations Density: 0.647%