INDEX
Explanations
information regarding historical events, particularly focusing on early stages or versions
Neuron Alignment
Index    Value    % of L₁
897      +0.17    0.6%
776      +0.16    0.5%
757      +0.12    0.4%
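The page does not define the "% of L₁" column. One plausible reading, offered here only as an assumption, is that it gives the absolute value of a neuron's entry as a share of the L₁ norm (sum of absolute values) of the whole vector. A minimal sketch under that assumption, using a made-up toy vector rather than the actual weights behind this table; `l1_share` and `toy_weights` are hypothetical names:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: this mirrors what "% of L1" means on the page;
    the source itself does not state the formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Toy data, not taken from the page.
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, small percentages like those in the table would indicate that no single neuron accounts for much of the vector's total weight.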
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
897      +0.17            0.05
1865     +0.16            0.04
757      +0.12            0.04
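The two columns above report different similarity measures, which is why their values need not agree. As a hedged illustration using the standard textbook definitions, Pearson correlation is the cosine similarity of the two vectors after each has been mean-centered. The page does not say which vectors are being compared, so the inputs below are toy data and the function names are hypothetical:

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity(
        [a - mean_x for a in x],
        [b - mean_y for b in y],
    )


# Toy data, not taken from the page: the two measures disagree here
# because the vectors share a large positive mean.
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]
print(cosine_similarity(x, y), pearson_correlation(x, y))
```

Mean-centering removes any shared offset, so a pair of vectors can have a positive cosine similarity while being negatively correlated, as in the toy example.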
Negative Logits
Sinopsis       -0.69
decembrie      -0.68
septembrie     -0.62
aprilie        -0.60
Inoltre        -0.60
ianuarie       -0.59
Informações    -0.58
Horário        -0.58
dropna         -0.58
Composição     -0.55
Positive Logits
repug          1.09
EARLY          1.09
unwarran       1.06
impractica     1.03
EARLY          1.02
noel           1.02
early          1.00
juven          0.99
reluct         0.98
hentai         0.98
Activations Density: 0.116%