Explanations
sentences related to legal and political systems, with a possible focus on reform and discrimination
Neuron Alignment
Index   Value   % of L₁
604     +0.10   0.3%
872     +0.10   0.3%
1553    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
273     +0.10      0.03
1380    +0.10      0.01
1553    +0.07      0.05
Negative Logits
<bos>        -0.85
cytoplas     -0.79
repug        -0.68
Bartholo     -0.66
ingrat       -0.66
déterminé    -0.64
Backman      -0.63
McLaugh      -0.63
Juf          -0.62
unspeak      -0.62
Positive Logits
outdated     0.66
inadequate   0.56
needs        0.53
inefficient  0.53
obsolete     0.51
flaws        0.50
flawed       0.47
malfunction  0.47
ineffective  0.47
system       0.47
Activations Density: 0.529%