INDEX
Explanations
This neuron almost never activates on any tokens (activation density 0.011%); it is effectively a dead (unused) neuron that does not detect any feature.
Negative Logits
contributions    -0.08
áhnout           -0.07
reporting        -0.07
thirteen         -0.06
transition       -0.06
contribution     -0.06
ост              -0.06
optimal          -0.06
Trem             -0.06
_post            -0.06
Positive Logits
¾                0.08
½                0.08
�                0.08
ICI              0.07
half             0.07
ัดการ            0.07
beberapa         0.07
ctica            0.06
/her             0.06
discriminate     0.06
Activations Density 0.011%
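To put the density figure in concrete terms, here is a minimal sketch. It assumes that activation density means the percentage of sampled tokens on which the neuron's activation is above zero; the page itself does not define the term. The function name, the threshold parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of active tokens, expressed in percent.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 11 active tokens out of 100,000.
sample = [1.0] * 11 + [0.0] * 99_989
density = activation_density(sample)
print(f"{density:.3f}%")  # prints "0.011%"
```

Under that assumption, a density of 0.011% corresponds to roughly 11 active tokens per 100,000 sampled, which is rare but not strictly zero.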