INDEX
Explanations
phrases related to legal offenses and penalties
Neuron Alignment
Index   Value   % of L₁
1553    +0.10   0.3%
453     +0.09   0.2%
1648    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
718     +0.10      0.03
1553    +0.09      0.04
417     +0.07      0.03
Negative Logits
Anm       -0.71
fign      -0.67
prodi     -0.67
idyl      -0.66
thut      -0.66
Forder    -0.66
pessi     -0.66
haer      -0.65
„,        -0.64
astéro    -0.64
Positive Logits
penalties       0.62
alties          0.56
penalty         0.52
fines           0.48
punishable      0.48
imprisonment    0.48
Penalties       0.47
offerts         0.46
felony          0.46
sightly         0.45
Activations Density: 0.134%