Explanations
references to the word "lion."
Neuron Alignment
Index   Value   % of L₁
1624    +0.15   0.6%
1222    +0.13   0.6%
1515    +0.12   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1624    +0.15      0.03
629     +0.13      0.02
1363    +0.12      0.02
Negative Logits
departament   -0.54
cso           -0.54
hek           -0.54
ché           -0.53
kras          -0.52
polie         -0.51
palet         -0.51
stok          -0.51
Verk          -0.51
lele          -0.50
Positive Logits
Lion    1.45
lion    1.42
Lions   1.37
Lion    1.36
lions   1.33
LION    1.28
Lions   1.23
lion    0.92
lions   0.89
Leo     0.86
Activations Density: 0.127%