INDEX
    Explanations

    Common English words

    The neuron responds to words expressing recommendations or instructions (e.g. “should,” “need,” “making,” “please”).

    Negative Logits
    |↵↵                   -0.07
    !*                    -0.07
    /******/↵             -0.06
    >())↵                 -0.06
    -----------↵↵         -0.06
    ?>'                   -0.06
    HOME                  -0.06
    ::↵↵                  -0.06
    [token text missing]  -0.06
    >()↵↵                 -0.06
    Positive Logits
     이동                   0.06
    ife                   0.06
    ويك                   0.06
     yüzde                0.06
     prostřed             0.06
     downtown             0.06
    ampilkan              0.06
    venient               0.06
    UCCESS                0.06
     Responsible          0.06
    Activation Density: 0.210%

    No Known Activations