Explanations
statistical analysis and results reporting
Neuron Alignment
Index   Value   % of L₁
1577    +0.24   0.9%
1343    +0.23   0.9%
599     +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1577    +0.24      0.11
1343    +0.23      0.11
845     +0.08      0.05
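
The page does not state how the "P. Corr." and "Cos Sim." columns are computed or which vectors they compare. As a rough sketch only, assuming the textbook definitions of Pearson correlation and cosine similarity over two equal-length vectors, the two statistics relate as follows. The function names and sample vectors are hypothetical and are not taken from the dashboard.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after mean-centering each vector."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)

# Hypothetical vectors: the two statistics can differ a lot on the same data.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
cos_uv = cosine_similarity(u, v)       # positive: all entries are positive
corr_uv = pearson_correlation(u, v)    # negative: the trends are opposite
```

Under these definitions the two values agree only when both vectors already have zero mean, so differing columns are expected even if both are computed on the same pair of vectors.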
Negative Logits
Token         Logit
<bos>         -3.13
/*            -0.87
/**           -0.78
<?            -0.71
#!/           -0.68
but           -0.68
Walkover      -0.67
#             -0.67
either        -0.66
CreateIndex   -0.66
Positive Logits
Token    Logit
increa   1.94
affor    1.91
maneu    1.90
emphat   1.90
suscep   1.87
impra    1.87
resear   1.86
?...     1.84
unlaw    1.84
unden    1.80
Activations Density: 0.948%