    Explanations

    This neuron activates on programming code tokens, i.e. parts of the text containing code examples or code-like syntax.

    Negative Logits
    Token           Logit
     Injection      -0.07
     vitamins       -0.07
    (blank)         -0.07
    _band           -0.07
    われ            -0.06
    -sex            -0.06
    цять            -0.06
     zw             -0.06
    _DR             -0.06
    activation      -0.06
    Positive Logits
    Token           Logit
    "%(             0.07
     род            0.07
    .Annotation     0.06
    (G              0.06
     yüksek         0.06
     ayn            0.06
     خد             0.06
    (blank)         0.06
    (sensor         0.06
     Derek          0.05
    Activation Density: 0.003%

    No Known Activations