INDEX
    Explanations

    social media or news-related content like tweets and news updates

    Negative Logits
     audits             -0.71
     involuntary        -0.66
     indemn             -0.65
     confidentiality    -0.63
     volunt             -0.62
     subsistence        -0.61
     tsun               -0.61
     disadvant          -0.60
     conformity         -0.59
     settlements        -0.58
    Positive Logits
    twitter             1.65
    facebook            1.14
    imgur               1.09
    google              1.06
    twitch              1.01
    redd                0.96
    youtube             0.93
    reddit              0.93
    blogspot            0.90
    github              0.89
    Activation Density 0.014%

    No Known Activations