INDEX
Explanations
keywords related to the justice system
Neuron Alignment
Index    Value    % of L₁
501      +0.18    0.7%
1406     +0.13    0.5%
687      +0.13    0.5%
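The "% of L₁" column is not defined on this page. A minimal sketch, assuming it means the absolute value of one component divided by the L₁ norm of the whole direction vector; the function name `top_aligned` and the toy vector are hypothetical and are not the feature's real weights:

```python
def top_aligned(direction, k=3):
    """Return the k largest-magnitude components as (index, value, share of L1 norm).

    Assumption: "% of L1" = |value| / sum of |all values| for the vector.
    """
    l1_norm = sum(abs(v) for v in direction)
    # Rank component indices by absolute value, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1_norm) for i in ranked[:k]]


# Toy vector for illustration only.
toy_direction = [0.5, -0.25, 0.125, 0.125]
rows = top_aligned(toy_direction, k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a small share such as 0.7% for the top neuron would suggest the direction is spread across many neurons rather than concentrated in a few.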
Correlated Neurons
Index    P. Corr.    Cos Sim.
501      +0.18       0.04
1548     +0.13       0.03
144      +0.13       0.03
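"P. Corr." and "Cos Sim." presumably abbreviate Pearson correlation and cosine similarity; the page does not say which vectors each is computed over. The sketch below only illustrates the two formulas on toy lists (hypothetical data), and shows that they can differ because Pearson correlation centers each vector on its mean first:

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a],
                             [y - mean_b for y in b])


# Toy vectors for illustration only.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
cos_uv = cosine_similarity(u, v)
corr_uv = pearson_correlation(u, v)
print(f"cosine={cos_uv:.3f} pearson={corr_uv:.3f}")
```

Here the two toy vectors point in broadly the same direction (positive cosine) while being perfectly anti-correlated once centered, which is why the two columns need not agree.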
Negative Logits
gabri        -0.69
teresa       -0.69
claudia      -0.69
intersper    -0.68
sergio       -0.66
rodriguez    -0.63
andrea       -0.63
encomp       -0.60
paula        -0.60
javier       -0.59
Positive Logits
justice      1.42
Justice      1.40
Justice      1.35
JUSTICE      1.31
justice      1.25
justices     0.98
Justices     0.93
Justicia     0.92
justicia     0.84
injustice    0.80
Activations Density: 0.078%