INDEX
    Explanations

    negative statements or contradictions, where the latter part of the sentence contradicts the earlier part

    negations and expressions of impossibility or inadequacy

    Negative Logits
    Token               Logit
    "only"              -0.89
    " merely"           -0.82
    " doubtless"        -0.73
    " PLUS"             -0.66
    " chiefly"          -0.66
    " unintentionally"  -0.65
    "rimination"        -0.64
    " alternatively"    -0.64
    " inadvertently"    -0.64
    " falsely"          -0.63

    Positive Logits
    Token               Logit
    " darn"             0.77
    " fuckin"           0.76
    "wow"               0.74
    "enough"            0.70
    "iability"          0.67
    "hin"               0.67
    " enough"           0.66
    "íķ"                0.65
    "Õ"                 0.64
    " fucking"          0.63
    Activation Density: 0.068%

    No Known Activations