INDEX
    Explanations

    promotions or deals offering rewards or benefits

    phrases related to receiving benefits or rewards

    Negative Logits
    Token         Logit
    "igr"         -0.82
    "lie"         -0.69
    "worthy"      -0.64
    "eem"         -0.63
    "Recomm"      -0.61
    "abin"        -0.61
    "bourne"      -0.60
    "eli"         -0.60
    "factor"      -0.59
    " fallacy"    -0.58
    Positive Logits
    Token             Logit
    " rid"            1.11
    " reimb"          1.03
    " refunds"        0.94
    " preferential"   0.94
    " rewarded"       0.92
    " bumped"         0.88
    " access"         0.87
    " compensated"    0.86
    " paid"           0.85
    " unlimited"      0.81
    Activation Density: 0.097%

    No Known Activations