    Explanations

    The neuron detects hedging or cautionary language that signals risk, likelihood, or necessity.

    Negative Logits
    Token           Logit
    "ه"             -0.08
    " nargs"        -0.07
    "ours"          -0.07
    "Ô"             -0.07
    "#elif"         -0.07
    "ivered"        -0.07
    " Load"         -0.07
    "GHz"           -0.07
    "TES"           -0.07
    " champions"    -0.07

    Positive Logits
    Token           Logit
    " A"            0.07
    " a"            0.07
    " The"          0.06
    "にか"          0.06
    "енная"         0.06
    "osci"          0.06
    "727"           0.06
    " aberr"        0.06
    "тый"           0.06
    "Soon"          0.06

    Activation Density: 0.033%

    No Known Activations