INDEX
    Explanations

    words and phrases related to public statements, backlash, and controversy surrounding social issues

    Negative Logits
    "GraphicsUnit"        -0.77
    "Espèce"              -0.76
    "хьтан"               -0.72
    " AssemblyVersion"    -0.71
    "awsze"               -0.69
    "AndEndTag"           -0.69
    "didSet"              -0.66
    "دانشنامهٔ"            -0.62
    "ArgsConstructor"     -0.62
    "Parcelize"           -0.61
    Positive Logits
    " tweeted"     1.91
    " tweet"       1.84
    " Twitter"     1.82
    " tweeting"    1.81
    " posting"     1.71
    " tweets"      1.68
    " posted"      1.66
    " twitter"     1.56
    "Twitter"      1.54
    " Tweet"       1.46
    Activation Density: 0.161%

    No Known Activations