INDEX
Explanations
references to cyber-related topics and vocabulary
Neuron Alignment
Index   Value   % of L₁
43      +0.13   0.8%
445     +0.13   0.7%
317     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
445     +0.13      0.01
177     +0.13      0.02
503     +0.13      0.02
Negative Logits
Token       Value
Caption     -1.82
ailing      -1.68
descent     -1.64
previous    -1.48
other       -1.47
lack        -1.44
mind        -1.44
longest     -1.42
ails        -1.41
izations    -1.40
Positive Logits
Token       Value
°           2.20
¤           1.96
ł           1.96
ģ           1.95
Ł           1.94
tochrome    1.94
Ķ           1.93
²           1.93
Ĵ           1.85
Ļ           1.85
Activations Density 0.157%