INDEX
Explanations
words related to historical events and societal issues
Neuron Alignment
Index   Value   % of L₁
1577    +0.32   1.2%
1013    +0.13   0.5%
50      +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1499    +0.32      0.20
1038    +0.13      0.10
297     +0.13      0.14
Negative Logits
kram      -1.09
utop      -1.07
gesta     -1.03
hek       -1.02
solidar   -1.01
bont      -1.01
lapto     -0.99
meis      -0.99
moza      -0.97
tyn       -0.97
Positive Logits
öyle        0.61
Bárbara     0.56
Quien       0.55
Méndez      0.54
Tienen      0.53
Junto       0.53
Valentín    0.53
Cárdenas    0.53
Mejía       0.52
Publicado   0.52
Activations Density: 9.843%