INDEX
Explanations
phrases related to specific cities or locations
Neuron Alignment
Index   Value   % of L₁
1023    +0.15   0.6%
196     +0.11   0.4%
1506    +0.11   0.4%
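The "% of L₁" column is not defined on the page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm (sum of absolute values) of the whole vector. The listed rows are consistent with that reading for a norm of roughly 25 (0.15 / 25 = 0.6%, 0.11 / 25 ≈ 0.4%). A minimal sketch under that assumption, where `l1_share_percent` and the toy vector are hypothetical and not taken from the dashboard:

```python
def l1_share_percent(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: this is what the "% of L1" column reports; the page
    itself does not define the column.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Toy vector with made-up values (L1 norm = 0.5 + 1.5 + 2.0 + 1.0 = 5.0).
toy_weights = [0.5, -1.5, 2.0, 1.0]
share = l1_share_percent(toy_weights, 1)  # 1.5 / 5.0 of the norm
```

The sign of the entry is dropped on purpose: the share measures magnitude, while the Value column keeps the sign.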
Correlated Neurons
Index   P. Corr.   Cos Sim.
438     +0.15      0.04
1023    +0.11      0.03
648     +0.11      0.03
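The two columns above appear to be a Pearson correlation ("P. Corr.") and a cosine similarity ("Cos Sim."). The page does not say which quantities are compared, so the sketch below only shows the standard definitions on plain Python lists. The function names are hypothetical, and the assumption that the dashboard uses these textbook formulas is mine.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the means before comparing, while cosine similarity does not, so the two can differ noticeably for the same pair of inputs, as they do in the table.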
Negative Logits
Token      Logit
PeEnEo     -0.66
GARET      -0.60
nutella    -0.59
emel       -0.56
esternos   -0.55
strona     -0.54
letti      -0.53
preghi     -0.53
htbp       -0.53
িখ         -0.52
Positive Logits
Token           Logit
Boston          1.42
Boston          1.34
Massachusetts   1.18
boston          1.17
BOSTON          1.15
Massachusetts   1.06
BOSTON          1.03
boston          0.99
Marathi         0.89
Bost            0.89
Activations Density 0.172%