INDEX
Explanations
subjects related to comparison, analysis, and evaluation
Neuron Alignment
Index   Value   % of L₁
2019    +0.09   0.3%
624     +0.09   0.2%
257     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
257     +0.09      0.04
1265    +0.09      0.03
1135    +0.07      0.02
Negative Logits
honn          -0.59
Mère          -0.59
chande        -0.59
malheureux    -0.58
cardin        -0.57
sembl         -0.56
pinak         -0.55
kanya         -0.54
debout        -0.54
soigne        -0.54
Positive Logits
instead       0.58
focus         0.57
replace       0.56
embrace       0.55
spesies       0.55
replaced      0.54
preferring    0.54
vervangen     0.53
Instead       0.53
concentrate   0.53
Activations Density 0.385%