INDEX
Explanations
mentions of specific locations and related events
Neuron Alignment
Index   Value   % of L₁
31      +0.16   0.6%
1516    +0.13   0.5%
1778    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
31      +0.16      0.03
1516    +0.13      0.03
1778    +0.12      0.03
Negative Logits
Cay           -0.48
chiaramente   -0.46
sembrano      -0.46
Oth           -0.46
silen         -0.44
incarcer      -0.44
Oru           -0.44
Braz          -0.43
Qar           -0.43
makeText      -0.43
Positive Logits
Afghanistan   1.20
Afghanistan   1.12
Afghan        1.08
ABUL          1.02
Afghan        0.98
Afghans       0.85
Darío         0.83
afghan        0.83
Afgan         0.80
Kabul         0.76
Activations Density 0.094%