INDEX
Explanations
number sequences or ordinal lists within the text
Neuron Alignment
Index   Value   % of L₁
1177    +0.12   0.4%
1013    +0.11   0.3%
1870    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1467    +0.12      0.05
2006    +0.11      0.06
1745    +0.10      0.06
Negative Logits
adal    -1.01
utop    -0.98
gesta   -0.91
franz   -0.87
anse    -0.85
erd     -0.83
meis    -0.83
zyn     -0.83
ché     -0.83
vola    -0.81
Positive Logits
unlaw      1.05
McLaugh    0.94
pamph      0.90
unwarran   0.87
philanth   0.86
unspeak    0.81
disagre    0.81
affor      0.79
Dijo       0.77
Qualquer   0.77
Activations Density 0.431%