INDEX
Explanations
statements emphasizing actions or roles played by individuals within the legal or political systems
Neuron Alignment
Index    Value    % of L₁
674      +0.13    0.4%
1978     +0.10    0.3%
1018     +0.08    0.2%
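The "% of L₁" column is not defined on the page. One plausible reading, offered here as an assumption, is that it gives each neuron's absolute weight as a share of the L₁ norm of the feature's full weight vector. If that reading is right, a value of +0.13 at 0.4% would imply an L₁ norm on the order of 30. A minimal sketch of that calculation, with a hypothetical helper `l1_shares` and a toy vector rather than the real weights:

```python
def l1_shares(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm.

    Assumption: this mirrors what the "% of L1" column reports; the page
    itself does not define the column.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1 for w in weights]


# Hypothetical toy vector, not the feature's actual weights.
toy = [0.5, -0.25, 0.25]
shares = l1_shares(toy)

for w, s in zip(toy, shares):
    print(f"{w:+.2f}  {s:.1%}")
```

Shares this small for the top-ranked neurons would suggest the feature is spread thinly across many neurons rather than aligned with any single one.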
Correlated Neurons
Index    P. Corr.    Cos Sim.
29       +0.13       0.04
1018     +0.10       0.03
1107     +0.08       0.03
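"P. Corr." presumably abbreviates Pearson correlation and "Cos Sim." cosine similarity. The page does not say which vectors are compared, so the sketch below only illustrates how the two measures differ on the same pair of vectors. The function names and the toy data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity(
        [x - mean_a for x in a],
        [y - mean_b for y in b],
    )


# Hypothetical toy data, not activations from the page.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]

print(f"pearson = {pearson(a, b):+.3f}")
print(f"cosine  = {cosine_similarity(a, b):+.3f}")
```

On this toy pair the Pearson value is close to -1 while the cosine similarity stays positive, because centering removes the shared offset. The two columns can therefore rank neurons differently.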
Negative Logits
Token         Logit
unspeak       -1.06
maneu         -1.01
shenan        -1.01
volunte       -0.94
impra         -0.94
ineffec       -0.94
practition    -0.92
encomp        -0.92
disagre       -0.92
miscon        -0.91
Positive Logits
Token              Logit
<bos>              0.95
him                0.94
us                 0.85
me                 0.75
them               0.73
him                0.71
setViewportView    0.69
hini               0.68
initComponents     0.66
AssemblyCompany    0.66
Activations Density 0.392%