INDEX
Explanations
legal terms and issues related to legislation and civil rights
Neuron Alignment
Index    Value    % of L₁
1135     +0.08    0.2%
147      +0.07    0.2%
569      +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
100      +0.08       0.03
1135     +0.07       0.03
561      +0.07       0.01
Negative Logits
impractica    -0.81
unlaw         -0.77
indor         -0.70
disgra        -0.68
adjour        -0.65
disad         -0.64
dilap         -0.62
unve          -0.60
erad          -0.60
indoc         -0.60
Positive Logits
privacy            1.01
Privacy            0.76
privacy            0.74
freedoms           0.71
freedom            0.71
rights             0.69
Privacy            0.68
autonomy           0.62
liberties          0.62
confidentiality    0.61
Activations Density 0.275%