INDEX
Explanations
This neuron activates on placeholder tokens such as “NAME_1” and “NAME_2”; that is, it detects mentions of a generic NAME_x entity.
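To make the explanation concrete, here is a minimal sketch of the surface pattern it describes. It assumes the placeholders take the form "NAME_" followed by digits; the regular expression and the helper name find_placeholders are hypothetical and only illustrate the kind of text the neuron is said to respond to, not how the neuron itself computes anything.

```python
import re

# Assumed placeholder shape: "NAME_" followed by one or more digits.
# This pattern is an illustration, not something taken from the model.
PLACEHOLDER = re.compile(r"\bNAME_\d+\b")


def find_placeholders(text: str) -> list[str]:
    """Return every NAME_<number> placeholder found in the text, in order."""
    return PLACEHOLDER.findall(text)


if __name__ == "__main__":
    print(find_placeholders("NAME_1 said hello to NAME_2."))
```

A string with no such placeholders, for example "My name is Alice.", yields an empty list under this pattern.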
Negative Logits
Token          Logit
ips            -0.06
diy            -0.06
ERICAN         -0.06
Türk           -0.06
.definition    -0.06
IVE            -0.06
phon           -0.06
.red           -0.06
_indent        -0.06
ive            -0.06
Positive Logits
Token          Logit
volatility     0.06
enabled        0.06
�              0.06
(Tree          0.06
ibilidad       0.06
breaker        0.06
adr            0.06
Serena         0.06
izzly          0.06
_der           0.06
Activations Density: 0.143%
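As a rough reading of this figure, 0.143% corresponds to about one token in 700. The sketch below assumes that activation density means the fraction of tokens on which the neuron's activation exceeds a threshold (zero here); that definition, the function name activation_density, and the sample values are assumptions for illustration, not something the page states.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above the threshold.

    Assumed definition of "density"; the source page does not define it.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


if __name__ == "__main__":
    # Made-up activations for four tokens; only one is above zero.
    sample = [0.0, 0.0, 1.5, 0.0]
    print(f"{activation_density(sample):.3%}")
```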