    Explanations

    phrases indicating uncertainty or speculation

    negative assessments or perceptions of various subjects

    Negative Logits
    Token             Logit
    "pez"             -0.93
    "instead"         -0.78
    " instead"        -0.77
    "itton"           -0.72
    "ilts"            -0.69
    " nonetheless"    -0.68
    " doubtless"      -0.66
    "éĹĺ"             -0.66
    " undoubtedly"    -0.64
    " Rouge"          -0.63
    Positive Logits
    Token             Logit
    " anymore"        1.26
    " bothered"       1.04
    " remotely"       1.00
    " nor"            0.99
    " anywhere"       0.96
    " anything"       0.95
    " any"            0.95
    " necessarily"    0.94
    " whatsoever"     0.94
    " terribly"       0.89
    Activation Density: 0.123%

    No Known Activations