Explanations
instances where numbers are being compared or ranked
Neuron Alignment
Index   Value   % of L₁
1978    +0.24   0.8%
776     +0.19   0.6%
321     +0.13   0.4%
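
The "% of L₁" column appears to report each neuron's share of the L₁ norm of the feature's alignment vector. That reading is an assumption, though the three listed rows are consistent with it (each value divided by its percentage implies a total L₁ norm of roughly 30). A minimal Python sketch under that assumption, using a hypothetical helper name and toy values rather than the feature's real weights:

```python
def l1_share(values):
    """Return each entry's absolute share of the vector's L1 norm, in percent."""
    l1 = sum(abs(v) for v in values)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(v) / l1 for v in values]


# Toy vector (hypothetical, chosen only so the arithmetic is easy to follow).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print(shares)
```

Shares are taken on absolute values, so a strongly negative entry counts as much as a strongly positive one.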
Correlated Neurons
Index   P. Corr.   Cos Sim.
321     +0.24      0.08
776     +0.19      0.06
755     +0.13      0.05
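
"P. Corr." and "Cos Sim." presumably abbreviate Pearson correlation and cosine similarity. The page does not say which two vectors are compared, so the sketch below only illustrates the two standard measures on toy data. The function names and inputs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy inputs (hypothetical): b is a plus a constant offset.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
print(pearson_correlation(a, b), cosine_similarity(a, b))
```

The two measures differ only in the mean-centering step. This is why the toy pair is perfectly correlated while its cosine similarity falls short of 1, and why the two columns of a table like the one above need not agree.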
Negative Logits
Token          Logit
unspeak        -1.61
apprehen       -1.44
intersper      -1.41
shenan         -1.38
gaily          -1.33
indescri       -1.32
endeavouring   -1.32
luxuriant      -1.31
vainly         -1.30
reluct         -1.29
Positive Logits
Token     Logit
silikon   1.17
alkoh     1.17
kosme     1.17
kompati   1.09
kafe      1.08
kön       1.05
kontinu   1.05
praktik   1.03
antik     1.03
maksi     1.02
Activations Density: 0.102%
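
Activation density is commonly read as the fraction of sampled tokens on which the feature fires. The page does not define it, so the sketch below assumes "fires" means an activation strictly above a threshold of zero. The helper name and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations (hypothetical): the feature fires on one of four tokens.
toy = [0.0, 0.0, 1.3, 0.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 0.102% would correspond to about one active token per thousand sampled.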