Explanations
phrases that mention a capital city or the term "capital."
Neuron Alignment
Index   Value   % of L₁
71      +0.14   0.8%
417     +0.12   0.7%
224     +0.11   0.6%
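
The page does not define the "% of L₁" column. The most plausible reading is each neuron's absolute alignment value divided by the L₁ norm (the sum of absolute values) of the feature's full alignment vector. The listed rows are roughly consistent with that reading (0.14 / 0.008 ≈ 17.5 for the implied norm), but it remains an assumption. A minimal sketch under that assumption, using a made-up toy vector and a hypothetical function name:

```python
def percent_of_l1(values):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means |v_i| / sum_j |v_j| * 100. The dashboard
    itself does not state this definition.
    """
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(v) / l1_norm for v in values]


# Toy vector for illustration only; these are not the feature's real weights.
toy_alignment = [0.5, -0.25, 0.25]
shares = percent_of_l1(toy_alignment)
```

Small percentages such as those in the table would, under this reading, indicate that the feature's weight is spread thinly across many neurons and not concentrated in a few.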
Correlated Neurons
Index   P. Corr.   Cos Sim.
71      +0.14      0.02
417     +0.12      0.02
91      +0.11      0.02
Negative Logits
³       -4.30
ª       -4.24
Ĥ¬      -4.21
ĨĴ      -4.04
»¿      -3.80
§       -3.76
¯       -3.69
Ī       -3.68
¢       -3.66
IJ      -3.66
Positive Logits
umes          1.85
illery        1.83
expenditure   1.83
ization       1.80
isations      1.72
izations      1.69
games         1.68
maze          1.68
atures        1.63
markets       1.61
Activations Density 0.118%
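
Activation density is usually read as the fraction of sampled tokens on which the feature fires (activation above zero). That definition is an assumption here, since the page gives only the figure. A minimal sketch with toy activations and a hypothetical helper name:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts strictly positive activations; the
    dashboard does not state the threshold it uses.
    """
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data for illustration: 2 of 8 tokens activate the feature.
toy_activations = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
density = activation_density(toy_activations)
```

Under this reading, a density of 0.118% would mean the feature fires on roughly 1 token in 850.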