INDEX
Explanations
mentions of zoos and zoo-related activities
Neuron Alignment
Index    Value    % of L₁
554      +0.13    0.5%
597      +0.13    0.5%
703      +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1437     +0.13       0.02
597      +0.13       0.01
390      +0.13       0.02
Negative Logits
tsu      -0.61
uj       -0.55
Vb       -0.55
fei      -0.55
nant     -0.54
Hc       -0.54
dora     -0.54
chong    -0.54
««       -0.53
umo      -0.52
Positive Logits
zoo      1.16
Zoo      1.13
Zoo      1.11
zoo      1.05
Zo       0.94
zoos     0.93
Zo       0.89
zo       0.85
zoom     0.76
zo       0.74
Activations Density: 0.094%