INDEX
Explanations
references to books and literary titles, along with mentions of gaming titles
Neuron Alignment
Index   Value   % of L₁
1108    +0.11   0.3%
1166    +0.11   0.3%
908     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
908     +0.11      0.04
297     +0.11      0.06
1438    +0.11      0.04
Negative Logits
commandés      -0.87
sappi          -0.85
affez          -0.82
sélectionnés   -0.82
autorisés      -0.82
sergio         -0.80
alberto        -0.80
conç           -0.79
déploy         -0.79
décid          -0.77
Positive Logits
produced       0.56
marketed       0.54
qualify        0.52
are            0.49
masterpieces   0.49
destined       0.49
featured       0.48
submitted      0.48
being          0.48
reviewed       0.48
Activations Density: 0.698%