    Explanations

    The neuron selectively activates on the Spanish feminine definite article “la.”

    Negative Logits
        Token           Logit
        "_iteration"    -0.07
        "Donald"        -0.07
        " ith"          -0.07
        [missing]       -0.07
        " zase"         -0.07
        "Pull"          -0.06
        "failure"       -0.06
        "decode"        -0.06
        "_dst"          -0.06
        "Either"        -0.06
    Positive Logits
        Token           Logit
        " viewType"     0.08
        "CREEN"         0.07
        "افية"          0.06
        " 유저"          0.06
        "./"            0.06
        "/page"         0.06
        "стров"         0.06
        "скую"          0.06
        "سين"           0.06
        "شة"            0.06
    Activation Density: 0.125%

    No Known Activations