INDEX
    Explanations

    mentions of social media usernames and handles

    mentions of social media accounts and interactions with them

    Head Attr Weights
    0:0.17
    1:0.05
    2:0.07
    3:0.13
    4:0.04
    5:0.11
    6:0.07
    7:0.03
    8:0.11
    9:0.07
    10:0.07
    11:0.03
    Negative Logits
    " ®"            -1.39
    " virtues"      -1.30
    " therein"      -1.29
    " astronauts"   -1.26
    " treasures"    -1.25
    " arsenic"      -1.21
    " pleasures"    -1.17
    " Sodium"       -1.11
    " fishes"       -1.11
    "etheless"      -1.10
    Positive Logits
    "yp"            1.58
    "union"         1.33
    "riot"          1.29
    "spr"           1.29
    "record"        1.28
    "oll"           1.26
    "poll"          1.25
    "nee"           1.24
    "hz"            1.24
    "gyn"           1.23
    Act Density 0.019%

    No Known Activations