INDEX
    Explanations

    The neuron strongly responds to the start-of-text token, i.e., the beginning of a sequence.

    Negative Logits
    Token             Logit
    " Gone"           -0.08
    (blank)           -0.08
    " مرح"            -0.08
    " Wanted"         -0.08
    " bumps"          -0.08
    (blank)           -0.08
    " cornerstone"    -0.08
    "/rem"            -0.08
    " births"         -0.08
    (blank)           -0.07
    Positive Logits
    Token             Logit
    " vergelijking"   0.08
    " portátil"       0.08
    "出去"            0.08
    " envelop"        0.07
    "pad"             0.07
    " groot"          0.07
    " fb"             0.07
    " Ub"             0.07
    "fw"              0.07
    " grip"           0.07
    Activation Density: 0.221%

    No Known Activations