INDEX
Explanations
information related to legal or police-related incidents
Neuron Alignment
Index    Value    % of L₁
381      +0.15    0.5%
1499     +0.14    0.4%
1510     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1919     +0.15       0.07
1510     +0.14       0.05
521      +0.12       0.04
Negative Logits
bordeaux     -1.52
matel        -1.51
tricot       -1.46
cannes       -1.45
swarovski    -1.42
murano       -1.41
broderie     -1.33
soigne       -1.30
chèvre       -1.28
marseille    -1.28
Positive Logits
said         0.87
noted        0.76
explained    0.74
stated       0.74
says         0.73
said         0.73
He           0.72
also         0.70
BeginInit    0.70
urged        0.70
Activations Density 0.244%