INDEX
    Explanations

    words related to news articles, updates, and information

    engagement prompts and advertising content

    NEGATIVE LOGITS
    " respectively"    -0.53
    " mechanically"    -0.51
    " fundamentals"    -0.48
    " alone"           -0.48
    " fame"            -0.47
    " Valve"           -0.47
    " partying"        -0.46
    " aesthetic"       -0.46
    " indifferent"     -0.45
    " deciding"        -0.45
    POSITIVE LOGITS
    " UNCLASSIFIED"     0.83
    " WATCHED"          0.75
    "CNN"               0.74
    " POLIT"            0.73
    "News"              0.72
    "Politics"          0.71
    " POLITICO"         0.70
    "embed"             0.68
    " Transcript"       0.67
    "politics"          0.65
    Activation Density: 0.874%

    No Known Activations