Explanations
words related to government, regulations, and official actions in a city or province
Neuron Alignment
Index    Value    % of L₁
1919     +0.14    0.4%
1499     +0.11    0.3%
1870     +0.09    0.3%
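
The page does not say how the "% of L₁" column is derived. One plausible reading is that each entry's absolute value is divided by the L₁ norm (the sum of absolute values) of the whole vector. The sketch below illustrates that reading only. The function name `l1_share` and the toy vector are hypothetical and are not taken from the page.

```python
def l1_share(weights):
    """Return each weight's absolute value as a percentage of the vector's L1 norm.

    Assumption: this mirrors how a "% of L1" column could be computed;
    the dashboard's actual method is not documented in the source.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1 for w in weights]


# Toy vector for illustration only (not data from the page).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print(shares)
```

Under this reading, a value of +0.14 accounting for 0.4% would imply an L₁ norm of roughly 35 for the full vector, which is consistent with the other two rows.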
Correlated Neurons
Index    P. Corr.    Cos Sim.
1919     +0.14       0.09
1499     +0.11       0.08
1048     +0.09       0.03
Negative Logits
nutr         -1.33
deleter      -1.28
angelo       -1.27
fluo         -1.26
contex       -1.25
intersper    -1.25
cannes       -1.24
napoli       -1.23
milano       -1.21
hcm          -1.21
Positive Logits
chose      0.74
يكب        0.73
expects    0.72
decided    0.71
wants      0.71
has        0.70
powinno    0.70
didn       0.69
took       0.69
cannot     0.69
Activations Density 0.444%
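
Activation density is commonly read as the fraction of tokens on which a feature fires at all. A minimal sketch of that calculation follows, assuming "fires" means an activation strictly above a threshold of zero. The function name `activation_density`, the threshold, and the toy activations are hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is
    strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy per-token activations for illustration only (not data from the page).
toy_acts = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.3, 0.0]
density = activation_density(toy_acts)
print(f"{density:.3%}")
```

On this reading, a density of 0.444% would mean the feature is active on roughly 4 to 5 of every 1,000 tokens.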