    Explanations

    phrases indicating invitations, gratitude, and proposals

    Negative Logits
    Token          Logit
    ien            -0.16
    jd             -0.16
    oir            -0.15
    _PLATFORM      -0.15
    _CM            -0.14
    çIJĨ            -0.14
    elman          -0.14
    iece           -0.14
    zes            -0.14
    ceed           -0.14

    Positive Logits
    Token          Logit
    shiv           0.17
    endon          0.15
    ILES           0.15
    ãĥ©ãĥ³ãĤ¹         0.15
    esson          0.14
     tetas         0.14
    ibal           0.14
    dzi            0.14
    iddet          0.14
    ibili          0.14
    Activation Density: 0.034%

    No Known Activations
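
Two of the tokens listed above (`çIJĨ` and `ãĥ©ãĥ³ãĤ¹`) look like byte-level BPE vocabulary strings rather than readable text. The page does not say which tokenizer the model uses, so the following is only a minimal sketch, assuming a GPT-2-style byte-to-unicode mapping. The function names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the page does not name the tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(token):
    """Turn a vocabulary string back into the text it stands for."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytearray()
    for ch in token:
        if ch in inverse:
            raw.append(inverse[ch])
        else:
            # Characters outside the table (e.g. a literal space shown by
            # the dashboard) are passed through unchanged.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


# The two non-ASCII tokens from the lists above, written as escapes so the
# exact code points are unambiguous.
negative_token = "\u00e7\u0132\u0128"
positive_token = "\u00e3\u0125\u00a9\u00e3\u0125\u00b3\u00e3\u0124\u00b9"

print(decode_token(negative_token))  # 理
print(decode_token(positive_token))  # ランス
```

If the assumption about the tokenizer holds, the remaining ASCII tokens decode to themselves, so only these two entries change when read this way.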