INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.22   0.8%
1870    +0.09   0.3%
1253    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2       -0.22      0.00
0       -0.09      0.00
1       -0.08      0.00
Negative Logits
we      -1.31
so      -1.25
could   -1.24
was     -1.24
can     -1.24
is      -1.24
they    -1.23
are     -1.22
it      -1.21
in      -1.20
Positive Logits
<bos>     8.87
ftu       2.36
autunno   2.29
ftre      2.25
appunt    2.21
sappi     2.21
fta       2.20
fatis     2.18
poft      2.13
affez     2.12
Activations Density: 0.000%
No Known Activations
This feature has no known activations.