INDEX
Explanations
specific mentions of numerical values in a structured format
Neuron Alignment
Index   Value   % of L₁
674     +0.12   0.4%
2000    +0.10   0.3%
438     +0.09   0.3%
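
The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it means each neuron's absolute value as a share of the whole vector's L₁ norm (the sum of absolute values). The function names `l1_share` and `top_aligned` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm."""
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1 for w in weights]


def top_aligned(weights, k=3):
    """Return (index, value, share) for the k entries with largest magnitude."""
    shares = l1_share(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], shares[i]) for i in order[:k]]


# Toy vector for illustration only (not the feature's real weights).
toy_weights = [0.5, -0.25, 0.25]
for index, value, share in top_aligned(toy_weights, k=2):
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, small shares such as 0.3% to 0.4% would indicate that the feature is spread across many neurons rather than concentrated in a few.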
Correlated Neurons
Index   P. Corr.   Cos Sim.
2000    +0.12      0.05
1398    +0.10      0.04
1047    +0.09      0.05
Negative Logits
Token              Logit
InkWell            -0.73
marginVertical     -0.70
ⓧ                  -0.66
marginHorizontal   -0.66
URBANA             -0.56
Δείτε              -0.55
Πηγή               -0.55
তথ্যসূত্র            -0.54
Caratter           -0.54
paddingVertical    -0.54
Positive Logits
Token         Logit
autrefois     0.61
these         0.59
heti          0.56
them          0.56
quelquefois   0.56
gaily         0.55
saad          0.54
tolerably     0.54
……"           0.53
soudain       0.53
Activations Density: 0.159%