INDEX
Explanations
phrases related to storytelling or plot development
Neuron Alignment
Index   Value   % of L₁
674     +0.15   0.5%
1265    +0.11   0.4%
1742    +0.11   0.4%
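The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that it gives each entry's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from this page.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumes "% of L1" means |w_i| / sum_j |w_j|; this is an interpretation,
    not something the source page states.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []
    # Rank indices by absolute value, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


# Hypothetical toy vector, not data from this page.
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy, top_k=2)
for index, value, share in rows:
    print(f"{index}  {value:+.2f}  {share:.1%}")
```

With a real vector of a few thousand entries, individual shares well below 1%, as in the table above, would be expected under this reading.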
Correlated Neurons
Index   P. Corr.   Cos Sim.
1742    +0.15      0.05
1671    +0.11      0.05
1425    +0.11      0.04
Negative Logits
ibi      -0.78
sembl    -0.77
Milán    -0.76
ria      -0.75
„,       -0.74
meis     -0.74
fta      -0.73
caufe    -0.72
vost     -0.72
cæ       -0.72
Positive Logits
now      1.04
Now      1.00
NOW      0.96
Now      0.92
now      0.91
NOW      0.87
ahora    0.72
Ahora    0.71
Ahora    0.65
agora    0.62
Activations Density 0.075%