INDEX
Explanations
occurrences related to legal complaints and initiatives
Neuron Alignment
Index   Value   % of L₁
674     +0.17   0.5%
1150    +0.12   0.4%
137     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1499    +0.17      0.06
1804    +0.12      0.04
514     +0.11      0.02
Negative Logits
hairc       -2.55
scrat       -2.31
affor       -2.27
swarovski   -2.17
increa      -2.17
ecru        -2.14
indestru    -2.14
embodi      -2.12
suscep      -2.12
cushi       -2.11
Positive Logits
<bos>            1.16
copg             0.73
smithy           0.72
gynhyrchwyd      0.72
phosa            0.71
fromnode         0.69
ніципа           0.69
simplifié        0.67
<>",             0.67
DeleteBehavior   0.66
Activations Density: 0.280%