INDEX
Explanations
mentions of social issues, government initiatives, and city development
Neuron Alignment
Index   Value   % of L₁
764     +0.16   0.5%
1870    +0.12   0.4%
227     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
764     +0.16      0.07
284     +0.12      0.08
875     +0.11      0.04
Negative Logits
sappi     -1.67
affez     -1.51
dises     -1.50
?...      -1.47
thut      -1.42
fep       -1.41
vogli     -1.40
parteci   -1.39
!...      -1.38
ftu       -1.37
Positive Logits
through      0.90
by           0.81
via          0.81
throughout   0.75
while        0.74
in           0.72
across       0.71
during       0.70
amid         0.69
.            0.69
Activations Density: 0.851%