Explanations
technical code snippets or computer-generated visual elements
Neuron Alignment
Index   Value   % of L₁
876     +0.18   0.5%
1343    +0.17   0.5%
674     +0.12   0.3%
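The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it reports each neuron's absolute value as a share of the vector's L₁ norm (the sum of absolute values). A minimal sketch under that assumption, using a hypothetical helper `l1_shares` and a toy vector that is not taken from the dashboard:

```python
def l1_shares(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm.

    Assumption: this mirrors what a "% of L1" column might report; the
    dashboard itself does not document the formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        # An all-zero vector has no meaningful shares.
        return [0.0 for _ in weights]
    return [abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, chosen so the arithmetic is easy to follow.
toy = [0.5, -0.25, 0.25]
shares = l1_shares(toy)
```

Under this reading, a value of +0.18 listed at 0.5% would imply an L₁ norm of roughly 36, which is in line with the other two rows once rounding is taken into account.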
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.18      0.04
24      +0.17      0.04
1527    +0.12      0.03
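The two column headers are abbreviated. Reading "P. Corr." as Pearson correlation is an assumption, and the page does not say which vectors are being compared. As a rough sketch of how the two measures differ, the hypothetical helpers below compute both for any pair of equal-length vectors; Pearson correlation is the cosine similarity of the mean-centered vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors, not taken from the dashboard.
rising = [1.0, 2.0, 3.0]
falling = [3.0, 2.0, 1.0]
r = pearson_correlation(rising, falling)
c = cosine_similarity(rising, falling)
```

In this toy case the two vectors are perfectly anti-correlated yet still have a positive cosine similarity, because cosine similarity does not remove the mean. That is one reason the two columns can disagree in magnitude.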
Negative Logits
arrête     -0.60
only       -0.57
belated    -0.57
catalogo   -0.57
at         -0.57
yet        -0.56
«          -0.56
"          -0.56
defiance   -0.56
‘          -0.55
Positive Logits
sappi      1.62
milano     1.54
napoli     1.43
Ottobre    1.40
dises      1.40
bandung    1.40
parlar     1.38
affez      1.35
lele       1.34
auguri     1.34
Activations Density: 0.114%