INDEX
Explanations
instances of numbers and technical terms related to a structured text or study
Neuron Alignment
Index   Value   % of L₁
674     +0.19   0.6%
1870    +0.14   0.4%
2034    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.19      0.04
1510    +0.14      0.03
1274    +0.11      0.03
Negative Logits
sappi     -1.32
scopri    -1.06
migli     -0.94
Augu      -0.92
PLW       -0.91
abbra     -0.91
rispond   -0.91
dichi     -0.90
affez     -0.90
dora      -0.90
Positive Logits
and           0.82
or            0.76
и             0.70
và            0.69
以及          0.67
или           0.66
bibnamefont   0.64
versus        0.63
和            0.63
และ           0.62
Activations Density: 0.216%