INDEX
    Explanations

    phrases indicating disagreement or opposition

    statements regarding consumer choices and political stances on social issues

    Negative Logits
    WAR
    -0.59
     Released
    -0.59
     RL
    -0.58
     NYC
    -0.55
    Born
    -0.53
     Weird
    -0.53
    agonists
    -0.53
    Pg
    -0.52
     Yor
    -0.52
     Whedon
    -0.52
    Positive Logits
    )."
    1.29
    .")
    1.14
    .""
    1.13
    .'"
    1.11
    ."[
    1.01
    '."
    1.00
     …"
    0.99
    ."
    0.99
     ..."
    0.96
    }"
    0.96
    Act Density 1.324%

    No Known Activations