    Explanations

    This neuron activates on the word “multiple” used in the sense of “more than one”.

    Negative Logits
     eater              -0.07
    -Mart               -0.07
    říž                 -0.07
    -business           -0.07
     intimidation       -0.06
     Shel               -0.06
     decoder            -0.06
    ंश                  -0.06
     Succ               -0.06
     tire               -0.06
    Positive Logits
    ']);                0.07
     metaph             0.06
     тех                0.06
    しない              0.06
    $con                0.06
    ?</                 0.06
    void                0.06
     ajout              0.06
    нив                 0.06
                        0.06
    Act Density 0.066%

    No Known Activations