INDEX
Explanations
locations or points in a narrative where actions or key events occur
Neuron Alignment
Index   Value   % of L₁
474     +0.09   0.3%
1108    +0.08   0.2%
314     +0.07   0.2%
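The "% of L₁" column is not defined on this page. A minimal sketch, assuming it reports each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole direction vector. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard. Under that assumption, a value of 0.09 at 0.3% would imply a total L₁ norm of roughly 30, although the rounding in the table makes this only a rough estimate.

```python
def l1_share(weights):
    """Return each weight's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means 100 * |w_i| / sum_j |w_j|.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Hypothetical 4-entry direction vector (illustrative values only).
toy_weights = [0.5, -0.25, 0.125, 0.125]
shares = l1_share(toy_weights)
```

Because a real direction has thousands of entries, even the most aligned neuron can hold well under 1% of the total, which matches the small percentages in the table.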
Correlated Neurons
Index   P. Corr.   Cos Sim.
474     +0.09      0.03
879     +0.08      0.02
1502    +0.07      0.03
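The two columns of the correlated-neurons table can differ because Pearson correlation subtracts each series' mean before comparing, while cosine similarity does not. The sketch below assumes both are computed over paired activation series; which quantities the dashboard actually compares is not stated here, and the function names and sample data are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Hypothetical activation series (illustrative values only).
feature_acts = [1.0, 2.0, 3.0]
neuron_acts = [3.0, 2.0, 1.0]

p_corr = pearson_correlation(feature_acts, neuron_acts)
cos_sim = cosine_similarity(feature_acts, neuron_acts)
```

In this toy case the series are perfectly anti-correlated, yet their cosine similarity is positive because both are non-negative, so the two measures need not agree in magnitude or even in sign.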
Negative Logits
Token           Logit
ceea            -0.42
Nevertheless    -0.42
Cannot          -0.41
</h4>           -0.41
cannot          -0.41
ága             -0.41
ছ               -0.40
by              -0.40
ні              -0.40
Didn            -0.40
Positive Logits
Token           Logit
dises           0.88
tanga           0.85
endom           0.80
dora            0.79
Minang          0.79
anton           0.78
abnorm          0.76
mef             0.76
huma            0.75
meis            0.74
Activations Density: 0.183%