INDEX
Explanations
mentions of specific numerical values in contexts related to technology, tutorials or instructions
Neuron Alignment
Index    Value    % of L₁
1961     +0.13    0.4%
1334     +0.13    0.4%
1896     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1334     +0.13       0.04
1961     +0.13       0.04
545      +0.12       0.03
Negative Logits
móg        -0.77
;;)        -0.69
plis       -0.64
!...       -0.63
uvres      -0.61
esss       -0.60
wong       -0.59
OfClass    -0.59
:))        -0.59
:,,        -0.58
Positive Logits
Mə              0.71
árbol           0.63
Misión          0.60
jetzt           0.59
Ilustra         0.58
Çok             0.58
símbolo         0.57
momento         0.56
mbggenerated    0.56
Haben           0.55
Activations Density: 0.155%