INDEX
Explanations
references to legal and procedural elements in a context involving authority, decisions, and evaluation of circumstances
Neuron Alignment
Index   Value   % of L₁
98      +0.13   0.7%
17      +0.13   0.7%
320     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
320     +0.13      0.07
345     +0.13      0.06
186     +0.12      -0.03
Negative Logits
breviations   -1.81
à¯            -1.58
reements      -1.54
tongues       -1.51
SUCH          -1.51
MENTS         -1.50
ashes         -1.46
jections      -1.44
))$.          -1.43
reactions     -1.40
Positive Logits
estate    1.68
govern    1.62
naire     1.54
consin    1.51
oku       1.51
kowski    1.50
itself    1.50
andom     1.48
isco      1.47
keit      1.42
Activations Density: 2.739%