INDEX
Explanations
No Explanations Found
Neuron Alignment
Index  Value  % of L₁
460    +0.12  0.6%
269    +0.11  0.6%
484    +0.11  0.6%
Correlated Neurons
Index  P. Corr.  Cos Sim.
258    +0.12     0.55
56     +0.11     0.18
203    +0.11     0.38
Negative Logits
¨      -2.40
¢      -2.16
Ľ      -2.14
ī      -2.08
atics  -2.05
Ń      -2.01
¡      -1.99
ħ      -1.92
Ī      -1.87
½      -1.83
Positive Logits
leted       1.58
lier        1.52
lichen      1.52
vast        1.49
orbit       1.46
anden       1.43
liest       1.41
suggestive  1.38
liche       1.36
arbitrary   1.36
Activations Density: 3.561%
No Known Activations
This feature has no known activations.