INDEX
Explanations
technical or coding terminology related to programming, state management, and data structures
Neuron Alignment
Index   Value   % of L₁
156     +0.17   1.0%
159     +0.14   0.8%
288     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
325     +0.17      0.12
213     +0.14      0.08
288     +0.13      0.13
Negative Logits
scoop      -1.60
ovo        -1.50
cookies    -1.43
pitcher    -1.39
maid       -1.38
glasses    -1.38
cassette   -1.29
flav       -1.26
MgCl       -1.25
acious     -1.25
Positive Logits
deaths   1.56
ales     1.51
onse     1.51
lands    1.50
ACP      1.47
own      1.38
Īĺ       1.38
ir       1.37
onica    1.36
umni     1.35
Activations Density: 4.086%