Explanations
events or actions happening in a sequence
Neuron Alignment
Index   Value   % of L₁
1328    +0.16   0.6%
481     +0.14   0.5%
381     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1328    +0.16      0.07
481     +0.14      0.06
1425    +0.12      0.05
Negative Logits
habile       -0.85
pleins       -0.83
éto          -0.82
joyeux       -0.77
animés       -0.75
délicieux    -0.75
obé          -0.73
malheureux   -0.73
rafraî       -0.73
récents      -0.72
Positive Logits
would        0.89
would        0.80
WOULD        0.78
Would        0.71
Would        0.69
wouldn       0.68
eventually   0.66
wouldnt      0.65
soon         0.59
could        0.58
Activations Density: 0.207%