INDEX
    Explanations

    multiple choice options

    The neuron fires on the appearance of the answer choice “C” token.

    Negative Logits
    Token         Logit
    ",get"        -0.07
    "сю"          -0.07
    ",就是"       -0.07
    " migrate"    -0.06
    " described"  -0.06
    "sse"         -0.06
    "__(/*!"      -0.06
    "IED"         -0.06
    " pued"       -0.06
    " eighth"     -0.06
    Positive Logits
    Token         Logit
    " Venom"      0.07
    "ouses"       0.06
    " gauss"      0.06
    "uang"        0.06
    " nodeList"   0.06
    "ектив"       0.06
    "ฤด"          0.06
    "commons"     0.06
    "aleza"       0.06
    "target"      0.06
    Activation Density: 0.002%

    No Known Activations