INDEX
    Explanations

    The neuron is selectively activated by placeholder-style entity tokens (e.g. NAME_, AUFIDIUS) rather than ordinary words.

    Negative Logits
     raid                   -0.07
    ]")↵                    -0.07
    ;">↵                    -0.07
    려요                    -0.07
    }"↵                     -0.07
    ーの                    -0.07
    ];↵                     -0.07
     pronto                 -0.06
     II                     -0.06
     golf                   -0.06

    Positive Logits
    水平                    0.07
    =G                      0.07
    thalm                   0.06
    [selected               0.06
    edy                     0.06
     Marco                  0.06
    -master                 0.06
     сильно                 0.06
    _sur                    0.06
     onCreateViewHolder     0.06

    Activation Density: 0.008%

    No Known Activations