INDEX
    Explanations

    phrases related to coercion or manipulation

    phrases that describe manipulation or coercion

    Negative Logits
    thora        -0.68
    tm           -0.67
    hetically    -0.64
    day          -0.62
    cu           -0.62
    rike         -0.62
    fred         -0.61
    entimes      -0.60
    ener         -0.60
    idates       -0.59
    Positive Logits
     believing    1.27
     submission   1.23
     agreeing     1.08
     accepting    1.07
     buying       1.00
     submitting   0.99
     adopting     0.98
     thinking     0.97
     abandoning   0.96
     committing   0.96
    Activation Density: 0.077%

    No Known Activations