INDEX
Explanations
references to pre-study or preliminary stages in research contexts
Neuron Alignment
Index   Value   % of L₁
46      +0.13   0.7%
218     +0.12   0.7%
156     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
386     +0.13      0.02
46      +0.12      0.02
374     +0.12      0.02
Negative Logits
Ĵ       -2.70
ľĵ      -2.63
Į       -2.45
ĻĤ      -2.43
£       -2.39
°       -2.34
ĸ´      -2.34
İ       -2.34
Ķ       -2.27
ĥ       -2.27
Positive Logits
fecture     1.96
historic    1.70
ppers       1.57
zzo         1.54
ceeding     1.53
awarded     1.52
clamation   1.51
-           1.51
iser        1.48
shipment    1.45
Activations Density: 0.052%