INDEX
Explanations
information related to incidents or accidents
Neuron Alignment
Index    Value    % of L₁
50       +0.17    0.5%
1577     +0.11    0.3%
1013     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
856      +0.17       0.10
227      +0.11       0.17
50       +0.10       0.18
Negative Logits
bandung    -0.77
habet      -0.77
indeb      -0.73
potest     -0.71
haup       -0.70
bombe      -0.69
jaja       -0.69
dises      -0.69
dora       -0.67
eorum      -0.66
Positive Logits
Bartholo     0.68
McLaugh      0.68
stretchr     0.66
Bengt        0.62
XYZ          0.62
Defective    0.60
Rodrig       0.59
Punct        0.59
Vaugh        0.59
Gorb         0.58
Activations Density: 5.217%