Explanations
phrases related to legal matters and government actions
Neuron Alignment
Index    Value    % of L₁
674      +0.13    0.4%
892      +0.11    0.4%
1757     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1156     +0.13       0.04
892      +0.11       0.05
1507     +0.09       0.04
Negative Logits
Token      Logit
nant       -0.80
istan      -0.78
sii        -0.76
tsi        -0.74
lele       -0.73
soggior    -0.72
cance      -0.70
fides      -0.70
kasa       -0.70
sena       -0.70
Positive Logits
Token       Logit
which       0.60
whom        0.56
whose       0.54
which       0.52
Apesar      0.51
whom        0.49
Which       0.47
whose       0.47
whatever    0.47
PARTIC      0.46
Activations Density: 0.279%