INDEX
Explanations
numerical patterns and sequences
Neuron Alignment
Index   Value   % of L₁
369     +0.19   1.0%
396     +0.14   0.8%
292     +0.14   0.8%
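The "% of L₁" column can be read as each neuron's share of the feature vector's total absolute weight. A minimal sketch under that assumption follows; the function name `l1_share` and the toy weights are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude weights.

    Assumption: "% of L1" means |weight_i| divided by the sum of |weight_j|
    over all neurons. The dashboard itself does not state its formula.
    """
    l1 = sum(abs(w) for w in weights)
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


# Hypothetical toy weights, chosen so the arithmetic is exact.
weights = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(weights, top_k=2)
for index, value, share in rows:
    print(f"{index:<7} {value:+.2f}   {share:.1%}")
```

Under this reading, a value of +0.19 at 1.0% would imply a total L₁ norm of roughly 19 for the feature's weight vector.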
Correlated Neurons
Index   P. Corr.   Cos Sim.
307     +0.19      0.04
82      +0.14      0.03
30      +0.14      0.03
Negative Logits
Token          Logit
ever           -1.75
envy           -1.62
lectual        -1.56
atement        -1.55
unity          -1.50
warfare        -1.50
interference   -1.47
infinity       -1.43
rich           -1.40
repetition     -1.39
Positive Logits
Token   Logit
ĥ½      2.17
ĺ       2.15
³       2.11
į       2.04
Ľ       1.96
Ł       1.73
£       1.70
ĻĤ      1.68
00      1.66
ł       1.64
Activations Density: 0.185%
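Activation density is commonly taken to be the fraction of sampled token positions on which the feature fires. A minimal sketch, assuming that definition and a strictly-positive firing threshold; the function name and the sample activations are hypothetical.

```python
def activation_density(activations):
    """Fraction of token positions where the activation is strictly positive.

    Assumption: a feature "fires" when its activation is > 0. The dashboard
    does not say which threshold it uses.
    """
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


# Hypothetical activations over four token positions.
acts = [0.0, 0.0, 1.5, 0.0]
density = activation_density(acts)
print(f"{density:.3%}")
```

On this reading, a density of 0.185% would mean the feature is active on about 1 in every 540 sampled tokens.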