Explanations
punctuation marks that often conclude statements
Neuron Alignment
Index   Value   % of L₁
50      +0.32   1.7%
382     +0.21   1.1%
1535    +0.17   0.9%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.32      0.22
2034    +0.21      0.18
1535    +0.17      0.14
Negative Logits
<bos>             -2.70
ⓧ                 -1.06
/***              -0.96
///**             -0.91
/**               -0.89
<?                -0.89
                  -0.88
springfox         -0.82
/*                -0.79
AssemblyCompany   -0.67
Positive Logits
seksi      0.86
soulign    0.85
épu        0.83
lele       0.80
Karakter   0.78
véhic      0.76
évalu      0.76
alté       0.76
déchir     0.75
marea      0.74
Activations Density: 1.208%