Explanations
sequences of letters and numbers in a structured way
Neuron Alignment
Index   Value   % of L₁
876     +0.21   0.7%
1343    +0.20   0.6%
1177    +0.18   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.21      0.05
1343    +0.20      0.05
71      +0.18      0.03
Negative Logits
occupe           -0.74
Secara           -0.73
aimerais         -0.73
Meskipun         -0.70
splitContainer   -0.70
arrête           -0.69
álbum            -0.69
Estou            -0.69
Setiap           -0.68
Selama           -0.66
Positive Logits
embodi   1.28
overla   1.27
meis     1.26
uhr      1.26
parati   1.25
wien     1.24
fluo     1.22
levis    1.20
erec     1.19
inder    1.19
Activations Density 0.276%