INDEX
Explanations
LaTeX formatting commands related to mathematical notation and styles
Neuron Alignment
Index   Value   % of L₁
444     +0.11   0.6%
369     +0.11   0.6%
359     +0.11   0.6%
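The "% of L₁" column can be read as one entry's absolute value divided by the L1 norm (sum of absolute values) of the whole vector. That reading is an assumption, since the page does not define the column. The sketch below uses a hypothetical vector, invented only so that an entry of 0.11 comes out near 0.6%; it is not the feature's real data, and `l1_share` is an illustrative name.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Hypothetical vector (assumption, not real data): three entries of 0.11
# followed by 360 filler entries of 0.05, giving an L1 norm of about 18.33.
weights = [0.11, 0.11, 0.11] + [0.05] * 360

share = l1_share(weights, 0)
print(f"{share:.1f}%")
```

A share this small for the top-ranked entries suggests the vector's mass is spread over many neurons rather than concentrated in a few.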
Correlated Neurons
Index   P. Corr.   Cos Sim.
111     +0.11      0.01
316     +0.11      0.01
18      +0.11      0.01
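"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors are compared (activations, weights, or something else), so the sketch below only shows the two standard formulas on made-up inputs. The function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Made-up inputs for illustration only (assumption, not dashboard data).
u = [1.0, 2.0, 3.0, 4.0]
v = [2.0, 1.0, 4.0, 3.0]
print(pearson_correlation(u, v), cosine_similarity(u, v))
```

The two measures differ only in the mean-centering step, which is why a pair can show a modest correlation alongside a near-zero cosine similarity when they are computed over different vectors.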
Negative Logits
Token   Logit
»¿      -2.34
Ĭ       -2.16
Ģ       -1.77
ĭ       -1.74
İ       -1.69
ĸ´      -1.68
ı       -1.68
¥       -1.67
ĵ       -1.66
ľ       -1.62
Positive Logits
Token   Logit
haven   1.76
media   1.71
EXPORT  1.63
cine    1.55
ian     1.53
styles  1.51
hair    1.51
iche    1.50
enum    1.49
bourg   1.48
Activations Density 0.012%