INDEX
    NEGATIVE LOGITS
    Token          Logit
    " PTS"         -0.73
    " Plaint"      -0.62
    "tnc"          -0.59
    " totality"    -0.58
    " Malone"      -0.57
    " relevance"   -0.57
    "pires"        -0.56
    " Ele"         -0.56
    " Pearson"     -0.55
    " antid"       -0.55
    POSITIVE LOGITS
    Token          Logit
    "'m"           1.60
    "'ve"          1.34
    " dunno"       1.32
    " suppose"     1.29
    " guess"       1.26
    "'ll"          1.26
    " swear"       1.16
    "'d"           1.14
    " am"          1.10
    " wanna"       1.07
    Activation Density: 0.204%

    No Known Activations