INDEX
    Explanations

    expressions of pleading or requests for help

    Negative Logits
    Token        Logit
    "hausen"     -0.16
    "edef"       -0.16
    "eyh"        -0.16
    "_fx"        -0.15
    "oulos"      -0.14
    " Sunder"    -0.14
    "anki"       -0.14
    "atts"       -0.14
    "evin"       -0.14
    "elon"       -0.14
    Positive Logits
    Token            Logit
    " beg"           0.19
    "beg"            0.18
    " Permission"    0.18
    " permission"    0.17
    " Beg"           0.17
    "gary"           0.17
    " mercy"         0.17
    " release"       0.16
    "ging"           0.16
    " begging"       0.16
    Activation Density: 0.030%

    No Known Activations