INDEX
Explanations
names of historical figures, locations, and organizations, especially related to politics and civil rights
Neuron Alignment
Index   Value   % of L₁
227     +0.15   0.4%
872     +0.08   0.3%
509     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
227     +0.15      0.09
1957    +0.08      0.03
1291    +0.08      0.06
Negative Logits
kask      -1.20
kram      -1.19
karton    -1.19
stoff     -1.17
traktor   -1.16
hek       -1.16
plak      -1.12
elek      -1.11
kade      -1.10
palet     -1.09
Positive Logits
Áng         0.76
Haci        0.74
Mónica      0.73
Darío       0.72
Justo       0.69
ⓧ           0.69
Whence      0.69
Mejía       0.66
soumettre   0.64
Valentín    0.63
Activations Density 0.933%