INDEX
Explanations
Mentions of different programs, agreements, community benefits, neighborhood initiatives, and infrastructure projects
Neuron Alignment
Index   Value   % of L₁
161     +0.19   1.0%
889     +0.17   0.9%
812     +0.16   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
889     +0.19      0.05
783     +0.17      0.04
161     +0.16      0.04
Negative Logits
shenan    -1.10
hairc     -1.01
hoody     -0.97
milf      -0.96
wikihow   -0.93
hentai    -0.91
lmfao     -0.91
Lmao      -0.90
increa    -0.88
disreg    -0.88
Positive Logits
designer    0.92
Mos         0.89
designer    0.88
Designer    0.86
Mos         0.82
Designer    0.81
warehouse   0.70
designers   0.70
canvas      0.69
kos         0.67
Activations Density 0.746%