INDEX
Explanations
terms related to bandwidth and throughput
Neuron Alignment
Index    Value    % of L₁
156      +0.25    1.5%
376      +0.15    0.9%
115      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
7        +0.25       0.01
156      +0.15       0.01
120      +0.12       0.01
Negative Logits
pleasure      -1.73
git           -1.59
ureus         -1.58
blessing      -1.43
cca           -1.43
Calendar      -1.41
help          -1.40
usepackage    -1.40
whom          -1.39
woke          -1.39
Positive Logits
worth       1.97
gap         1.70
wise        1.70
ontal       1.60
block       1.55
neutral     1.55
spectrum    1.53
"}](#       1.48
furt        1.45
worthy      1.43
Activations Density 0.021%