    Explanations

    The neuron detects small-degree modifiers: words and phrases that qualify or hedge a description (e.g. “slightly,” “uneven,” “small,” “slight”).

    Negative Logits
    Token         Logit
    "upport"      -0.07
    "(Network"    -0.07
    " Yönetim"    -0.07
    " Fu"         -0.07
    " Yu"         -0.07
    "u"           -0.07
    "-den"        -0.07
    " Commerce"   -0.07
    "-msg"        -0.06
    "dou"         -0.06
    Positive Logits
    Token               Logit
    " slightly"         0.17
    " slight"           0.11
    "lightly"           0.07
    (token not shown)   0.07
    (token not shown)   0.07
    " falsely"          0.07
    " slightest"        0.07
    "개의"              0.07
    " gently"           0.07
    " altered"          0.07
    Activation Density: 0.007%

    No Known Activations