Explanations
references to action figures and popular culture characters
Neuron Alignment
Index   Value   % of L₁
678     +0.11   0.3%
736     +0.09   0.3%
1819    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
678     +0.11      0.04
736     +0.09      0.04
1157    +0.08      0.04
Negative Logits
akut      -0.99
Muhamma   -0.87
maksi     -0.85
kontinu   -0.85
kemer     -0.82
praktik   -0.81
panik     -0.81
territo   -0.80
ekos      -0.80
makro     -0.79
Positive Logits
Realistic   0.66
cgi         0.63
époux       0.63
Realistic   0.59
scp         0.57
héri        0.57
Modelling   0.56
Seigneur    0.55
Chinois     0.55
vga         0.55
Activations Density: 0.379%