INDEX
Explanations
objective
This neuron does not activate on any of the document tokens; it appears to be effectively “dead.”
Negative Logits
SCRIPT      -0.07
orient      -0.07
τολ         -0.07
toll        -0.07
anas        -0.06
wrest       -0.06
low         -0.06
Yep         -0.06
ंट          -0.06
Levels      -0.06
Positive Logits
VO          0.07
oid         0.07
kako        0.06
uu          0.06
ivo         0.06
mük         0.06
.bg         0.06
.GroupBox   0.06
=z          0.06
prostor     0.06
Activations Density 0.012%