Explanations
code snippets featuring variable assignments and function definitions
Neuron Alignment
Index    Value    % of L₁
453      +0.18    0.6%
876      +0.15    0.4%
1871     +0.14    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
453      +0.18       0.03
876      +0.15       0.00
1871     +0.14       0.03
Negative Logits
unlaw            -0.98
adjour           -0.85
McLaugh          -0.85
spokespersons    -0.82
assailed         -0.82
endeavouring     -0.78
vainly           -0.78
shewn            -0.75
roused           -0.75
laboured         -0.75
Positive Logits
cannes     1.51
espé       1.48
aquare     1.37
tén        1.31
pép        1.31
veau       1.30
marte      1.30
pét        1.29
franz      1.24
haup       1.22
Activations Density: 0.072%