INDEX
Explanations
No Explanations Found
Neuron Alignment
Index    Value    % of L₁
674      +0.43    3.2%
1741     +0.07    0.5%
1120     +0.03    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2        -0.43       0.00
0        -0.07       0.00
1        -0.03       0.00
Negative Logits
ে    -1.03
भी    -1.03
не    -1.02
про    -1.01
но    -1.00
া    -1.00
ার    -1.00
لينك    -1.00
например    -0.99
而且    -0.99
Positive Logits
<bos>    12.33
fuf    4.20
effe    4.19
fta    4.15
encomp    4.14
squa    4.13
guarante    4.12
secon    4.10
affor    4.08
desir    4.08
Activations Density 0.000%
No Known Activations
This feature has no known activations.