INDEX
    Explanations

    instances of skepticism or disagreement

    expressions of uncertainty or controversy surrounding social issues

    Negative Logits
    Token        Logit
    "ãĤ§"        -0.70
    "ula"        -0.67
    "alde"       -0.66
    "bowl"       -0.64
    "si"         -0.63
    "orean"      -0.61
    "()."        -0.61
    "atos"       -0.61
    "omever"     -0.60
    "Americ"     -0.60
    Positive Logits
    Token             Logit
    " nonetheless"    1.16
    " nevertheless"   0.93
    "etheless"        0.92
    " cautioned"      0.74
    " undeniably"     0.71
    " concedes"       0.71
    " curiously"      0.71
    " balk"           0.69
    " persists"       0.69
    " elusive"        0.68
    Activation Density 0.797%

    No Known Activations