Explanations
mentions of actions or events happening over time
Neuron Alignment
Index   Value   % of L₁
32      +0.14   0.5%
478     +0.13   0.4%
1306    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
32      +0.14      0.09
68      +0.13      0.06
1678    +0.11      0.07
Negative Logits
matel      -1.13
karton     -1.07
budapest   -1.06
kafe       -1.04
silikon    -1.03
torba      -1.02
mikrofon   -1.00
maksi      -0.99
alkoh      -0.99
siena      -0.99
Positive Logits
have       0.86
been       0.85
has        0.73
had        0.69
have       0.68
been       0.65
HAVE       0.64
become     0.63
come       0.62
BEEN       0.62
Activations Density: 0.344%