INDEX
Explanations
terms related to X-ray technology and vulnerabilities
Neuron Alignment
Index   Value   % of L₁
904     +0.07   0.2%
1810    +0.06   0.2%
1232    +0.06   0.2%
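The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. That reading is an assumption, since the dashboard does not define the column. A minimal sketch under that assumption follows; `l1_share` and `toy_weights` are hypothetical names and the numbers are illustrative, not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means the absolute value of one entry divided by
    the sum of absolute values of all entries.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("weight vector has zero L1 norm")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not data from the dashboard.
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, a value of +0.07 at 0.2% would imply a total L₁ norm of roughly 35, which is consistent with the weight being spread thinly over many neurons.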
Correlated Neurons
Index   P. Corr.   Cos Sim.
283     +0.07      0.03
1343    +0.06      0.06
1884    +0.06      0.04
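Assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity, the two columns differ only in mean-centering: Pearson correlation is the cosine similarity of the two vectors after subtracting each one's mean. The sketch below illustrates that relationship with hypothetical helper names and toy inputs. It does not reproduce how the dashboard computed its figures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity of centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy inputs (hypothetical): the vectors point in broadly the same direction,
# so cosine similarity is positive, yet they are perfectly anti-correlated.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
print(cosine_similarity(a, b), pearson_correlation(a, b))
```

Because the two measures can disagree in sign and magnitude, it is reasonable for a dashboard to report both.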
Negative Logits
cu      -0.91
an      -0.83
must    -0.83
I       -0.83
bar     -0.82
over    -0.82
won     -0.82
min     -0.81
so      -0.81
*       -0.80
Positive Logits
fta     3.07
ftu     3.01
increa  2.89
thut    2.83
squa    2.81
aen     2.78
fup     2.76
ftre    2.74
wherea  2.73
maneu   2.69
Activations Density 0.366%