Explanations
actions and decisions related to transactions and personal agency
Neuron Alignment
Index   Value   % of L₁
50      +0.15   0.6%
678     +0.13   0.5%
1177    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
678     +0.15      0.09
1504    +0.13      0.08
946     +0.11      0.07
Negative Logits
Token        Logit
<bos>        -2.59
.            -0.99
AfterClass   -0.91
also         -0.89
↵↵           -0.89
;            -0.89
SizeMode     -0.88
,            -0.88
:            -0.87
—            -0.86
Positive Logits
Token        Logit
impra        2.74
swarovski    2.72
increa       2.72
aen          2.56
stockholm    2.55
thut         2.55
emphat       2.52
squa         2.52
maneu        2.51
depic        2.50
Activations Density 1.097%