INDEX
Explanations
specific criteria listed in documents
Neuron Alignment
Index    Value    % of L₁
50       +0.15    0.8%
757      +0.10    0.5%
200      +0.09    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
200      +0.15       0.02
437      +0.10       0.02
1133     +0.09       0.01
Negative Logits
Token        Logit
<bos>        -3.03
/***         -0.83
ⓧ            -0.78
             -0.71
<?           -0.69
/**          -0.61
/*!          -0.59
reclaim      -0.59
/*++         -0.59
modernize    -0.57
Positive Logits
Token        Logit
criteria     1.31
Criteria     1.26
criteria     1.21
Criteria     1.19
criterion    1.16
jawa         1.15
CRITERIA     1.09
Criterion    1.07
jaya         1.02
riteria      1.00
Activations Density: 0.106%