INDEX
Explanations
descriptions of step-by-step actions or procedures
Neuron Alignment
Index   Value   % of L₁
776     +0.13   0.4%
381     +0.10   0.3%
331     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
331     +0.13      0.07
1491    +0.10      0.06
421     +0.10      0.05
Negative Logits
churrasco   -0.68
churras     -0.68
°;          -0.66
Cæsar       -0.65
Darío       -0.65
cytok       -0.64
“…”         -0.63
tortas      -0.62
myn         -0.62
Mónica      -0.62
Positive Logits
There       0.75
There       0.74
there       0.73
there       0.71
THERE       0.69
THERE       0.68
earcher     0.53
exists      0.51
hasn        0.51
كومونز      0.51
Activations Density 0.155%