INDEX
    Explanations

    The neuron fires on XML attribute names or identifiers containing the substring “action”.

    Negative Logits
    (token missing)    -0.07
     chronological    -0.07
     donn    -0.07
    Bien    -0.07
    getModel    -0.06
     yours    -0.06
    041    -0.06
     deg    -0.06
     خودش    -0.06
     ندار    -0.06

    Positive Logits
    最佳    0.07
    aussian    0.07
    ص    0.07
     researchers    0.07
    _TEST    0.06
     strugg    0.06
    해보    0.06
     AWS    0.06
    efficient    0.06
     merging    0.06

    Act Density 0.001%

    No Known Activations