Explanations
Locations and states; "Kansas" in particular produces high activations.
Neuron Alignment
Index    Value    % of L₁
204      +0.14    0.5%
1387     +0.13    0.5%
1637     +0.12    0.5%
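The page does not define the "% of L₁" column. A minimal sketch of one plausible reading follows, assuming each value is one entry of a weight vector and the percentage is that entry's absolute value divided by the vector's L₁ norm (the sum of absolute values). The function name `l1_fractions` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_fractions(weights, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k largest weights.

    Assumption: "% of L1" means |value| / sum(|all values|).
    Entries are ranked by signed value, largest first.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        return []  # all-zero vector: no meaningful shares
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1_norm) for i in ranked[:top_k]]


if __name__ == "__main__":
    # Toy vector for illustration only, not the real weights.
    toy = [0.14, -0.05, 0.13, 0.02, 0.12]
    for index, value, share in l1_fractions(toy):
        print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Under this reading, a share of only 0.5% for the top entries would mean the weight is spread thinly across many neurons instead of being concentrated in a few.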
Correlated Neurons
Index    P. Corr.    Cos Sim.
204      +0.14       0.03
648      +0.13       0.03
1637     +0.12       0.03
Negative Logits
alho        -0.56
putes       -0.54
valente     -0.51
Catawiki    -0.50
pegno       -0.49
ferrer      -0.48
assertIn    -0.48
ytick       -0.47
Thiết       -0.47
ceğim       -0.47
Positive Logits
Kansas      1.43
Kansas      1.32
KANSAS      1.26
Kans        1.01
Missouri    0.96
KC          0.96
kc          0.90
Topeka      0.90
Missouri    0.87
KU          0.78
Activations Density: 0.146%
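The page does not say how the density figure is computed. A minimal sketch follows, assuming it is the fraction of sampled tokens on which the feature's activation is strictly positive. The function name `activation_density` and the sample values are hypothetical.

```python
def activation_density(activations):
    """Fraction of tokens with a strictly positive activation.

    Assumption: "density" counts a token as active when its
    activation is greater than zero.
    """
    if not activations:
        return 0.0  # no tokens sampled
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


if __name__ == "__main__":
    # Toy activations for illustration only.
    sample = [0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.3]
    print(f"{activation_density(sample):.3%}")
```

Read this way, a density of 0.146% would correspond to the feature firing on roughly one token in every 700.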