    Explanations

    phrases indicating disagreement

    instances of the word "disagree."

    Negative Logits
    " Roads"          -0.74
    " Jackets"        -0.73
    "recorded"        -0.71
    "pmwiki"          -0.69
    "ammy"            -0.69
    "GV"              -0.67
    "eval"            -0.66
    "examination"     -0.66
    "ams"             -0.65
    "Pros"            -0.64
    Positive Logits
    " disagree"       1.25
    " disagrees"      0.91
    "rences"          0.90
    " disagreement"   0.87
    "edIn"            0.84
    " disagreed"      0.82
    " disagreements"  0.79
    " opinions"       0.78
    " agre"           0.75
    "uous"            0.75
    Activation Density: 0.010%

    No Known Activations