INDEX
    Explanations

    Dialogue/conversational

    Tokens that are part of the assistant's generated output (i.e., the assistant role / response text).

    Negative Logits
    Token             Logit
    "19"              -0.08
    " beats"          -0.07
    "someone"         -0.06
    "60"              -0.06
    "uro"             -0.06
    "membership"      -0.06
    "30"              -0.06
    "28"              -0.06
    "Occ"             -0.06
    "20"              -0.06
    Positive Logits
    Token             Logit
    "��"              0.08
    " hôn"            0.07
    "ิร"              0.07
    "Не"              0.07
    ".setContentType" 0.06
    "setFont"         0.06
    " ruku"           0.06
    " etwa"           0.06
    "ább"             0.06
    " luego"          0.06
    Activation Density: 0.069%

    No Known Activations