INDEX
    Explanations

    actions that indicate assistance or enhancement in various contexts

    Negative Logits
    their          -0.23
    they           -0.22
    sWith          -0.20
    swith          -0.19
    the            -0.19
    that           -0.18
     yourselves    -0.18
    those          -0.18
    able           -0.18
    's             -0.17
    Positive Logits
     itself         0.30
    heets           0.20
    cales           0.19
    ided            0.19
                    0.18
     собой          0.18
    /is             0.17
    '               0.17
    boro            0.16
    OwnProperty     0.16
    Act Density 0.757%

    No Known Activations