INDEX
    Explanations

    The neuron activates on placeholder model or project identifiers of the form “NAME_1,” specifically on the tokens that make up the placeholder name.

    Negative Logits
     flawless               -0.06
    ประเภท                  -0.06
     Hindered               -0.06
     hatır                  -0.06
     Airport                -0.06
     NotImplementedError    -0.06
    _fence                  -0.06
    StatusCode              -0.06
    ække                    -0.06
                            -0.06
    Positive Logits
     stab                   0.07
     Fancy                  0.07
     Alicia                 0.06
     ASA                    0.06
     fora                   0.06
     mia                    0.06
    wifi                    0.06
    ню                      0.06
     freopen                0.06
    psz                     0.06
    Activation Density: 0.049%

    No Known Activations