INDEX
Explanations
terms related to reviewing and determining the cause of incidents, as well as preventing them from recurring
Neuron Alignment
Index   Value   % of L₁
764     +0.09   0.2%
674     +0.08   0.2%
1378    +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1000    +0.09      0.01
392     +0.08      0.02
1030    +0.06      0.02
Negative Logits
disreg      -1.06
impra       -0.99
maneu       -0.92
suscep      -0.88
embodi      -0.88
inconce     -0.87
increa      -0.86
shenan      -0.85
uninten     -0.85
intermitt   -0.85
Positive Logits
future       0.80
future       0.73
repeat       0.63
prevention   0.62
repeat       0.61
Future       0.60
Future       0.56
Repeat       0.56
futuro       0.56
prevent      0.55
Activations Density 0.221%