INDEX
Explanations
words related to legal proceedings such as court cases and charges
Neuron Alignment
Index   Value   % of L₁
1343    +0.27   0.9%
394     +0.12   0.4%
304     +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.27      0.08
1343    +0.12      0.07
227     +0.11      0.06
Negative Logits
Token            Value
(not captured)   -0.88
in               -0.85
no               -0.81
.                -0.81
a                -0.81
to               -0.81
,                -0.80
e                -0.79
de               -0.79
non              -0.79
Positive Logits
Token     Value
alkoh     2.25
kask      2.07
karton    2.06
milano    2.02
marte     2.02
cannes    2.01
silikon   1.99
kosme     1.98
drap      1.97
moza      1.96
Activations Density 0.229%