INDEX
    Explanations

    words related to criticism or negative judgment

    terms that convey negative judgments or criticisms

    Negative Logits
    Token             Logit
    "vals"            -0.71
    "chance"          -0.68
    "quart"           -0.65
    "semble"          -0.65
    "rollers"         -0.63
    "interrupted"     -0.62
    "eon"             -0.61
    "ngth"            -0.61
    " Cups"           -0.60
    "asio"            -0.60
    Positive Logits
    Token             Logit
    " underest"       0.88
    " enough"         0.85
    " underestimate"  0.79
    " overest"        0.74
    " hypocr"         0.73
    " because"        0.72
    "uably"           0.71
    " grounds"        0.71
    " hypocrisy"      0.70
    " exagger"        0.70
    Act Density 0.165%

    No Known Activations