INDEX
Explanations
connections between various films and studios, particularly focusing on comparisons and successes
Neuron Alignment
Index   Value   % of L₁
964     +0.15   0.6%
612     +0.12   0.5%
1804    +0.09   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.15      0.01
1804    +0.12      0.04
612     +0.09      -0.01
Negative Logits
.                 -0.48
<eos>             -0.46
↵↵                -0.38
..                -0.37
...               -0.36
Thunk             -0.35
....              -0.35
imprimée          -0.35
.\                -0.33
exceptionnelle    -0.33
Positive Logits
burberry      0.76
😭😭          0.74
nutella       0.71
dises         0.69
blackpink     0.69
lidl          0.69
haup          0.69
Ikr           0.68
frankfurt     0.67
stockholm     0.67
Activations Density 0.508%