INDEX
Explanations
references to urban environments and contexts
Neuron Alignment
Index   Value   % of L₁
156     +0.29   1.7%
266     +0.12   0.7%
362     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
266     +0.29      0.02
506     +0.12      0.01
505     +0.12      0.01
Negative Logits
chie      -1.59
deal      -1.57
yla       -1.41
thood     -1.39
roma      -1.36
rium      -1.35
rocy      -1.33
annual    -1.32
agles     -1.31
gradual   -1.30
Positive Logits
¾   2.01
ł   1.96
µ   1.89
Ł   1.83
¼   1.76
¸   1.73
¢   1.71
¦   1.64
°   1.61
ĸ   1.61
Activations Density: 0.007%