INDEX
Explanations
phrases emphasizing financial details
Neuron Alignment
Index    Value    % of L₁
48       +0.12    0.7%
460      +0.11    0.6%
355      +0.11    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
98       +0.12       0.17
23       +0.11       0.34
321      +0.11       0.21
Negative Logits
ippy         -1.62
bounds       -1.51
atives       -1.45
obb          -1.42
Amsterdam    -1.39
vre          -1.38
que          -1.38
^-           -1.37
ibilities    -1.35
Barcelona    -1.34
Positive Logits
!”            1.76
¦             1.76
»             1.75
ĩ             1.68
Ľ             1.62
¼             1.61
±             1.60
unreadable    1.59
sis           1.58
¾             1.49
Activations Density: 4.829%