INDEX
Explanations
patterns resembling ASCII art
Neuron Alignment
Index   Value   % of L₁
1343    +0.24   0.9%
50      +0.22   0.8%
1577    +0.16   0.6%
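The "% of L₁" column can be read as each neuron's share of the total absolute weight. The sketch below is a minimal illustration of that reading, assuming the column is the absolute value of one entry divided by the L₁ norm of the whole vector; the function name `l1_share` and the toy vector are hypothetical and not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means abs(value) / sum(abs(all values)).
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not real dashboard data.
toy = [0.5, -0.25, 0.25, 0.0]
print(f"{l1_share(toy, 0):.1%}")
```

Under this reading, the listed values would imply an L₁ norm of roughly 27 for the full vector (for example, 0.24 / 0.009), though the rounding in the table makes that only an estimate.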
Correlated Neurons
Index   P. Corr.   Cos Sim.
184     +0.24      0.03
1343    +0.22      0.09
1009    +0.16      0.04
Negative Logits
comprim         -0.55
bezeichneter    -0.54
<bos>           -0.54
ladri           -0.51
khó             -0.47
تقاوى           -0.47
parten          -0.46
biling          -0.46
/*              -0.45
solidar         -0.45
Positive Logits
shewn           0.59
endeavouring    0.57
:,,             0.57
ftu             0.56
PLWABN          0.56
embodi          0.55
vns             0.55
apprehen        0.54
fign            0.53
poff            0.53
Activations Density: 1.331%