Explanations
No Explanations Found
Neuron Alignment
Index   Value   % of L₁
489     +0.14   0.8%
136     +0.12   0.7%
152     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
0       -0.14      0.00
1       -0.12      0.00
2       -0.11      0.00
Negative Logits
ĺ     -2.76
Ļª    -2.71
ĭ     -2.66
ħ     -2.54
¢     -2.54
ĻĤ    -2.49
ĩ     -2.48
Ĵ     -2.42
ĵ     -2.40
ĥ     -2.38
Positive Logits
SEA  1.55
oversight  1.52
********************************  1.52
GPL  1.52
elon  1.50
ielder  1.49
lish  1.48
slash  1.44
****************************************************************************  1.43
--"  1.43
Activations Density 0.000%
No Known Activations
This feature has no known activations.