INDEX
    Explanations

    Flags the assistant’s stock clarification prompt: phrases such as “Do you have any specific questions…?” that ask the user for more details.

    Negative Logits
    "807"         -0.07
    "(['/"        -0.06
    "Numbers"     -0.06
    ",next"       -0.06
    "atak"        -0.06
    "̣"           -0.06
    " leftist"    -0.06
    "edef"        -0.06
    " chiefly"    -0.06
    " REUTERS"    -0.06
    Positive Logits
    " grupos"        0.07
    " connect"       0.06
    " Ne"            0.06
    "among"          0.06
    "invert"         0.06
    " Spr"           0.06
    "keit"           0.06
    " düz"           0.06
    " darkness"      0.06
    (token missing)  0.06
    Act Density 0.076%
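
The density figure can be read as a fraction of token positions. The sketch below assumes that "Act Density" means the share of sampled tokens on which the activation is above zero; the dashboard's exact definition is not stated here, and the function name, threshold, and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (number of active tokens) / (number of tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 8 of them active.
sample = [0.0] * 9992 + [1.3] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the dashboard value
```

Under this reading, a density of 0.076% would correspond to roughly 76 active tokens per 100,000 sampled.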

    No Known Activations