INDEX
Explanations
mentions of legal and judicial processes
Neuron Alignment
Index   Value   % of L₁
184     +0.16   0.5%
1553    +0.09   0.3%
198     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.16      0.02
610     +0.09      0.05
1553    +0.09      0.05
Negative Logits
lancia    -0.99
dises     -0.99
volu      -0.83
glan      -0.82
applau    -0.81
umo       -0.80
volunte   -0.80
campa     -0.80
bett      -0.79
accla     -0.79
Positive Logits
often        0.92
Often        0.84
often        0.83
Often        0.79
sometimes    0.75
oftentimes   0.70
usually      0.69
sometimes    0.66
Sometimes    0.65
usually      0.64
Activations Density: 0.400%