INDEX
Explanations
detailed descriptions or discussions about various elements or components of a topic
Neuron Alignment
Index   Value   % of L₁
687     +0.15   0.5%
1265    +0.13   0.5%
866     +0.11   0.4%
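The "% of L₁" column is not defined on this page. One plausible reading, offered here only as an assumption, is that it reports each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole direction vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from the tool that produced this page.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the L1 norm of weights.

    Assumption: "% of L1" means one entry's share of the sum of absolute
    values. The page itself does not confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0.0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Made-up toy direction for illustration, not data from this page.
toy_direction = [0.5, -0.25, 0.25]
print(l1_share(toy_direction, 0))  # 50.0
```

Under this reading, a value of +0.15 reported as roughly 0.5% would imply a total L₁ norm of about 30, meaning the direction is spread thinly across many neurons.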
Correlated Neurons
Index   P. Corr.   Cos Sim.
687     +0.15      0.03
1218    +0.13      0.03
1372    +0.11      0.02
Negative Logits
Token    Logit
rimir    -0.52
romero   -0.52
jamón    -0.51
parati   -0.50
Seeder   -0.47
>>;      -0.47
Marín    -0.47
nõ       -0.47
naran    -0.46
Mejía    -0.45
Positive Logits
Token     Logit
Aspect    1.24
aspect    1.16
aspect    1.14
aspects   1.07
aspects   1.02
Aspect    0.98
Aspects   0.98
Aspects   0.96
facets    0.81
facet     0.80
Activations Density: 0.068%