    Explanations

    This neuron fires on words and short phrases used to introduce advice or recommendations (e.g. “start,” “try,” “make,” “use,” “consider”).

    Negative Logits
    "essential"       -0.06
    " punishments"    -0.06
    "/misc"           -0.06
    " CHANNEL"        -0.06
    " açıklam"        -0.06
    "jspb"            -0.06
    "btn"             -0.06
    " чт"             -0.06
    "inverse"         -0.06
    ".Hidden"         -0.06
    Positive Logits
    (not shown)       0.07
    (not shown)       0.06
    " έναν"           0.06
    "na"              0.06
    "вест"            0.06
    " leží"           0.06
    " Chavez"         0.06
    (not shown)       0.06
    "šek"             0.06
    "λμ"              0.06
    Activation Density: 0.057%

    No Known Activations