Explanations
terms related to electrical components and systems
Neuron Alignment
Index   Value   % of L₁
156     +0.23   1.3%
201     +0.12   0.7%
213     +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
304     +0.23      0.02
113     +0.12      0.02
208     +0.10      0.02
Negative Logits
apologize   -1.92
ês          -1.76
intend      -1.73
vez         -1.72
chitz       -1.71
wegian      -1.65
ionale      -1.59
hers        -1.54
ocre        -1.54
ego         -1.53
Positive Logits
¬   3.40
£   3.12
Ł   3.04
¦   2.98
ĸ   2.89
Ĵ   2.88
¡   2.85
ŀ   2.81
ķ   2.75
Ħ   2.74
Activations Density 0.092%