Explanations
instances of the word "seen"
Neuron Alignment
Index   Value   % of L₁
50      +0.32   1.4%
1691    +0.14   0.6%
1491    +0.12   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1691    +0.32      0.04
1491    +0.14      0.04
1127    +0.12      0.03
Negative Logits
<bos>       -2.15
Maryland    -0.52
Enllaços    -0.51
sol         -0.50
tons        -0.50
Tony        -0.49
Mary        -0.49
낼          -0.48
Nieder      -0.47
Iné         -0.46
Positive Logits
chrysler    1.08
kapag       1.07
Seen        1.04
Seen        1.04
seen        1.03
nutella     1.03
tucson      1.01
oreo        0.99
errone      0.96
mcdonald    0.96
Activations Density: 0.104%