INDEX
    Explanations

    phrases indicating surprise or disbelief

    phrases expressing incredulity or emphasizing a lack of something

    NEGATIVE LOGITS
    rend     -0.82
    Aren     -0.69
    _-       -0.68
    ãĤ¿      -0.67
    ahime    -0.64
    ller     -0.64
    plex     -0.63
    ollen    -0.63
    cel      -0.63
    ilt      -0.62
    POSITIVE LOGITS
     remotely     1.22
     bother       0.82
     anymore      0.75
     slightest    0.72
     bothered     0.72
     though       0.71
     mention      0.68
     tho          0.68
     bothering    0.68
     outright     0.65
    Activation Density: 0.044%

    No Known Activations