INDEX
Explanations
instances of the word "on"
Neuron Alignment
Index   Value   % of L₁
312     +0.11   0.6%
396     +0.11   0.6%
421     +0.11   0.6%
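The page does not define the Value and % of L₁ columns above. One plausible reading, assumed here, is that Value is the feature's weight on a given neuron and % of L₁ is that weight's absolute value as a share of the L₁ norm of the feature's full weight vector. The function name `l1_share` and the toy vector are hypothetical and are not data from this page.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the L1 norm of `weights`.

    Assumption: this mirrors the "% of L1" column; the page itself
    does not state how that column is computed.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("weight vector has zero L1 norm")
    return 100.0 * abs(weights[index]) / l1_norm


# Hypothetical toy weight vector with L1 norm 1.0 (not taken from the page).
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1f}% of L1")
```

Under this reading, a value of 0.11 at 0.6% would imply an L₁ norm of roughly 18, though both displayed figures are rounded.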
Correlated Neurons
Index   P. Corr.   Cos Sim.
69      +0.11      0.22
152     +0.11      0.17
47      +0.11      0.15
Negative Logits
¾       -2.37
Ŀ       -2.16
ĺ       -2.11
¶       -1.99
Ľ       -1.96
ĵ       -1.95
ĥ½      -1.95
ı       -1.91
Ń       -1.88
ļ       -1.83
Positive Logits
sec              1.35
refer            1.33
Song             1.33
monitors         1.28
dec              1.27
EMA              1.26
days             1.25
disambiguation   1.25
charts           1.25
chant            1.24
Activations Density 0.440%
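The density figure above is commonly read as the share of sampled tokens on which the feature has a nonzero activation. The sketch below assumes that definition, which this page does not state. The function name `activation_density` and the toy values are hypothetical.

```python
def activation_density(activations):
    """Return the percentage of entries with a strictly positive activation.

    Assumption: "density" means the share of tokens where the feature fires;
    the page does not spell out its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical per-token activations (not taken from the page).
toy_activations = [0.0, 0.0, 1.5, 0.0]
print(f"Activations Density {activation_density(toy_activations):.3f}%")
```

On this reading, a density of 0.440% would correspond to the feature firing on about 44 of every 10,000 sampled tokens.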