Explanations
years expressed in text form
Neuron Alignment
Index    Value    % of L₁
674      +0.15    0.4%
559      +0.13    0.4%
776      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
776      +0.15       0.06
321      +0.13       0.05
787      +0.09       0.04
Negative Logits
oleo          -1.26
casio         -1.25
michelin      -1.25
snoopy        -1.17
tupperware    -1.13
affor         -1.13
bougie        -1.12
vhs           -1.08
exorbit       -1.08
embodi        -1.08
Positive Logits
bonucle            0.62
/*                 0.62
UVWXYZ             0.57
]=="               0.54
obacz              0.53
snippetHide        0.52
]>=                0.52
rscheinlichkeit    0.52
цуз                0.51
coroutines         0.51
Activations Density 0.135%