INDEX
    Explanations

    negative sentiment

    The neuron detects German‐language words or word pieces.

    Negative Logits
    Token          Logit
    "February"     -0.06
    "Ye"           -0.06
    "AGON"         -0.06
    " seront"      -0.06
    "ById"         -0.06
    " boz"         -0.06
    " STOP"        -0.06
    "tes"          -0.06
                   -0.06
    " dit"         -0.06

    Positive Logits
    Token          Logit
    " tainted"     0.07
    "°}"           0.07
    "_OPENGL"      0.07
    " OUTER"       0.07
    ".xhtml"       0.07
    " inferior"    0.07
    "يلم"          0.06
    " japan"       0.06
    "uds"          0.06
    " mas"         0.06

    Act Density 0.009%

    No Known Activations