INDEX
Explanations
mentions of various historical figures, events, and novels
Neuron Alignment
Index   Value   % of L₁
198     +0.16   0.6%
1265    +0.16   0.6%
281     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
198     +0.16      0.04
281     +0.16      0.03
1265    +0.13      0.03
Negative Logits
leçons            -0.67
négociations      -0.60
investissements   -0.59
émissions         -0.58
prochaines        -0.57
laim              -0.56
alnız             -0.56
inconvénients     -0.55
réunions          -0.55
débats            -0.55
Positive Logits
fta     1.29
ftu     1.22
vns     1.21
poff    1.18
fep     1.11
fup     1.10
ftre    1.08
reft    1.06
miu     1.05
fte     1.05
Activations Density: 0.075%