INDEX
Explanations
number and symbol patterns mixed with words, potentially related to product specifications
Neuron Alignment
Index    Value    % of L₁
1343     +0.15    0.4%
1445     +0.12    0.3%
2019     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1678     +0.15       0.03
1343     +0.12       0.03
1445     +0.10       0.03
Negative Logits
Kiedy         -0.60
bowiem        -0.56
hornblende    -0.56
bronco        -0.56
chert         -0.54
zirc          -0.53
pelic         -0.53
gaily         -0.52
avanzado      -0.52
Wię           -0.52
Positive Logits
nessuna     0.80
nessun      0.76
cristina    0.73
Più         0.73
incess      0.72
parlar      0.72
allarg      0.71
ritard      0.71
apparti     0.70
Queste      0.70
Activations Density: 0.083%