    Explanations

    conversation turn markers

    Markers that signal chat metadata and structure, such as message headers, role tags, and conversation boundary tokens.

    Negative Logits
    Token             Logit
    "998"             -0.08
    "ujet"            -0.08
    "ecure"           -0.08
    " öt"             -0.08
    "ulton"           -0.08
    " Moo"            -0.08
    "290"             -0.07
    "afari"           -0.07
    " Placeholder"    -0.07
    "reet"            -0.07
    Positive Logits
    Token             Logit
    " actually"       0.08
    " табли"          0.08
    "(木"             0.08
    " zwar"           0.08
    "орая"            0.08
    "setQuery"        0.08
    "ellite"          0.08
    ",},\n"           0.08
    "aley"            0.08
    " actual"         0.08
    Activation Density: 0.454%

    No Known Activations