INDEX
    Explanations

    phrases indicating contrast and comparison

    phrases indicating a comparison or contrast

    Negative Logits
    uble          -0.74
    enos          -0.72
    amon          -0.67
    atile         -0.66
    erest         -0.65
     Clicker      -0.63
    olis          -0.62
    esome         -0.62
    acha          -0.61
    nce           -0.60
    Positive Logits
     preferably   0.83
    etheless      0.77
     excluding    0.75
    cause         0.75
     coerc        0.72
     evidenced    0.72
     preferring   0.71
    ardless       0.69
     including    0.68
     theirs       0.68
    Act Density 0.290%

    No Known Activations