INDEX
Explanations
historical and mythological references
Neuron Alignment
Index   Value   % of L₁
1842    +0.17   0.5%
184     +0.14   0.4%
736     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
736     +0.17      0.05
509     +0.14      0.05
293     +0.11      0.04
Negative Logits
broder    -0.83
pét       -0.80
dè        -0.77
marte     -0.76
lapto     -0.75
torba     -0.75
balon     -0.74
patin     -0.74
cori      -0.74
traktor   -0.74
Positive Logits
McLaugh     1.01
McInt       0.96
despotism   0.93
Rodrig      0.83
Vaugh       0.81
prerog      0.76
Bartholo    0.76
belliger    0.73
Daven       0.73
unlaw       0.72
Activations Density: 0.321%