Explanations
references to Marvel characters or merchandise

Neuron Alignment
Index    Value    % of L₁
486      +0.10    0.3%
47       +0.09    0.3%
971      +0.08    0.3%

Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
690      +0.10            0.03
1793     +0.09            0.02
570      +0.08            0.02

Negative Logits
Token             Value
(not rendered)    -1.34
ⓧ                 -1.27
<bos>             -1.25
/**               -1.22
quitted           -1.09
<?                -1.08
pegasus           -0.89
intersper         -0.87
shenan            -0.86
gaily             -0.84

Positive Logits
Token       Value
Marvel      1.97
Marvel      1.82
Spider      1.48
Spider      1.40
marvel      1.28
spider      1.18
Avengers    1.07
spider      1.06
marvel      1.05
Avengers    1.02

Activations Density: 0.159%