INDEX
Explanations
specific financial or economic terms related to monetary values
Neuron Alignment
Index   Value   % of L₁
286     +0.15   0.9%
410     +0.13   0.8%
125     +0.11   0.6%
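The source does not define the "% of L₁" column. One plausible reading is that it reports each neuron's absolute alignment value as a share of the L₁ norm (the sum of absolute values) of the full alignment vector. The sketch below illustrates that reading only. The function names `l1_shares` and `top_aligned` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_shares(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|; the source does not confirm this.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1_norm for w in weights]


def top_aligned(weights, k=3):
    """Return (index, value, share) for the k entries with the largest |value|."""
    shares = l1_shares(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], shares[i]) for i in order[:k]]


# Toy alignment vector (hypothetical values, not the feature's real weights).
toy = [0.5, -0.25, 0.125, 0.125]

for index, value, share in top_aligned(toy, k=2):
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, the small percentages in the table would mean the alignment is spread thinly across many neurons rather than concentrated in a few.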
Correlated Neurons
Index   P. Corr.   Cos Sim.
286     +0.15      0.01
468     +0.13      0.01
293     +0.11      0.01
Negative Logits
abbit      -1.67
aling      -1.56
aning      -1.53
thing      -1.47
ights      -1.42
phants     -1.42
phant      -1.41
?"         -1.40
studying   -1.39
haps       -1.39
Positive Logits
etts       1.49
hered      1.37
RP         1.36
etto       1.35
ucc        1.35
Buc        1.32
RU         1.28
seud       1.27
ulators    1.27
nut        1.27
Activations Density 0.009%