INDEX
Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
674     +0.42   3.9%
1741    +0.06   0.5%
1120    +0.04   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1700    +0.42      0.00
99      +0.06      0.00
72      +0.04      0.00
Negative Logits
ে   -1.06
भी   -1.05
не   -1.03
া   -1.03
ко   -1.03
الق   -1.02
াই   -1.02
про   -1.02
но   -1.02
的话   -1.02
Positive Logits
<bos>      12.87
fuf        4.27
encomp     4.25
fta        4.25
guarante   4.24
effe       4.24
squa       4.21
affor      4.18
secon      4.14
desir      4.12
Activations Density 0.000%
No Known Activations
This feature has no known activations.