INDEX
    Explanations

    mentions of social media actions or usernames

    indications of social media engagement

    Negative Logits
    " thrust"       -0.79
    "etheless"      -0.75
    " tackling"     -0.69
    " penal"        -0.68
    " unsc"         -0.68
    " stewards"     -0.66
    "arers"         -0.64
    " tug"          -0.64
    " manoeuv"      -0.64
    " poisoning"    -0.64
    Positive Logits
    "Associated"    1.08
    "Follow"        1.03
    "Anonymous"     1.00
    "Original"      0.98
    "FOR"           0.98
    "OTHER"         0.97
    "STER"          0.94
    "EDIT"          0.92
    "TOP"           0.91
    "LET"           0.90
    Activation Density: 0.062%

    No Known Activations