INDEX
Explanations
No Explanations Found
Neuron Alignment

| Index | Value | % of L₁ |
|-------|-------|---------|
| 674   | +0.38 | 1.9%    |
| 1253  | +0.04 | 0.2%    |
| 381   | +0.04 | 0.2%    |
Correlated Neurons

| Index | P. Corr. | Cos Sim. |
|-------|----------|----------|
| 2     | -0.38    | 0.00     |
| 0     | -0.04    | 0.00     |
| 1     | -0.04    | 0.00     |
Negative Logits

| Token | Logit |
|-------|-------|
| ে | -0.96 |
| ко | -0.93 |
| لينك | -0.91 |
| ॉल | -0.90 |
| INVISIBLE | -0.90 |
| الف | -0.90 |
| only | -0.90 |
| Những | -0.90 |
| া | -0.89 |
| يوم | -0.89 |
Positive Logits

| Token | Logit |
|-------|-------|
| <bos> | 11.60 |
| encomp | 3.98 |
| fuf | 3.92 |
| guarante | 3.89 |
| squa | 3.88 |
| fta | 3.87 |
| increa | 3.81 |
| accla | 3.79 |
| secon | 3.78 |
| intersper | 3.77 |
Activations Density: 0.000%
No Known Activations
This feature has no known activations.