INDEX
Explanations
names of specific locations, particularly related to Los Angeles
Neuron Alignment
Index   Value   % of L₁
486     +0.16   0.6%
1515    +0.15   0.6%
1133    +0.13   0.4%
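
The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's direction over neurons. The source does not define it, so the sketch below is an assumption: `direction` is a hypothetical list of per-neuron values, and the share is taken as |value| divided by the sum of absolute values. The listed rows are at least consistent with this reading (0.16 at 0.6% implies an L₁ norm somewhere near 27 to 29 after rounding).

```python
def top_aligned(direction, k=3):
    """Return the k neurons with the largest absolute value in `direction`.

    Each result is (index, value, share), where share = |value| / L1 norm.
    This definition of "% of L1" is an assumption, not taken from the source.
    """
    l1_norm = sum(abs(v) for v in direction)
    if l1_norm == 0:
        raise ValueError("direction has zero L1 norm")
    # Rank neuron indices by magnitude, largest first.
    order = sorted(range(len(direction)),
                   key=lambda i: abs(direction[i]),
                   reverse=True)[:k]
    return [(i, direction[i], abs(direction[i]) / l1_norm) for i in order]


# Toy direction with L1 norm 4.0 (hypothetical values, not from the dashboard).
toy_direction = [0.5, -1.5, 2.0, 0.0]
for index, value, share in top_aligned(toy_direction, k=2):
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

A small share for even the top neurons, as in the table, would suggest the feature is spread across many neurons rather than aligned with any single one.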
Correlated Neurons
Index   P. Corr.   Cos Sim.
1178    +0.16      0.04
486     +0.15      0.03
1515    +0.13      0.03
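
Assuming "P. Corr." denotes Pearson correlation and "Cos Sim." denotes cosine similarity, the two columns measure different things: Pearson correlation centers both vectors on their means first, while cosine similarity does not. The source does not say which vectors are compared, so the inputs below are hypothetical and the sketch only illustrates why the two numbers can differ for the same pair.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance of x and y over the product of their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine similarity: dot product over the product of Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical vectors: y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
print(f"pearson={pearson(x, y):.3f} cosine={cosine(x, y):.3f}")
```

Here the offset leaves the Pearson correlation at its maximum but lowers the cosine similarity, which is one reason the two columns need not agree.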
Negative Logits
Token        Logit
Może         -0.78
Sklici       -0.75
Bardzo       -0.70
Dziękuję     -0.67
🕗           -0.64
Pozdrawiam   -0.62
Dzięki       -0.62
Wię          -0.61
Dlaczego     -0.61
piña         -0.61
Positive Logits
Token        Logit
Angeles      1.20
ANGELES      0.97
Los          0.86
LA           0.82
California   0.81
angeles      0.74
Kalifor      0.73
California   0.72
LA           0.72
Los          0.71
Activations Density: 0.062%
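
Activation density is commonly read as the fraction of sampled tokens on which the feature fires. The sketch below assumes that reading; the function name, the `threshold` parameter, and the sample data are hypothetical and not taken from the source. Under this assumption, 0.062% would correspond to roughly 62 active tokens per 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumes density means "share of tokens where the feature is active";
    the dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 62 active tokens out of 100,000.
sample = [1.0] * 62 + [0.0] * 99_938
print(f"{activation_density(sample):.3%}")
```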