    Explanations

    The neuron activates on tokens conveying read-only or immutable concepts (e.g. "read-only," "readonly").

    Negative Logits
        " speaks"      -0.07
        " weit"        -0.06
        "shadow"       -0.06
        " shadow"      -0.06
        " besser"      -0.06
        "_trap"        -0.06
        " Junk"        -0.06
        " notation"    -0.06
        " running"     -0.06
        "yle"          -0.06

    Positive Logits
        "_DETAIL"      0.07
        " bedrooms"    0.07
        "_Query"       0.07
        "...↵"         0.06
        " CentOS"      0.06
        " sécur"       0.06
        " escal"       0.06
        " Yosemite"    0.06
        " circuits"    0.06
        "สร"           0.06

    Act Density 0.087%

    No Known Activations