INDEX
    Explanations

    Twitter usernames

    the presence of social media handles or mentions

    Negative Logits
    Token             Logit
     Icar             -0.63
     consolidation    -0.61
     partly           -0.61
     coax             -0.58
     Takeru           -0.55
     cigarette        -0.53
     sterile          -0.53
     stricken         -0.53
     wholes           -0.53
     dirt             -0.53
    Positive Logits
    Token             Logit
     (@               4.12
     ðŁ               1.62
    ðŁ                1.52
     @                1.50
    ï¸ı               1.46
     tweeted          1.39
     "@               1.38
     (#               1.37
     âľ               1.35
     ðŁij             1.35
    Act Density 0.023%

    No Known Activations
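
Several of the positive-logit tokens above (such as ðŁ and ï¸ı) look garbled, but they are consistent with how a GPT-2 style byte-level BPE vocabulary spells raw bytes as printable characters. The sketch below is a minimal illustration under that assumption; this page does not name the tokenizer, and `token_to_bytes` is a hypothetical helper, not part of any dashboard API. It also assumes the leading space shown in a token stands for the tokenizer's space marker.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style table mapping each byte to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(0x21, 0x7F))      # '!' .. '~'
          + list(range(0xA1, 0xAD))    # inverted '!' .. not sign
          + list(range(0xAE, 0x100)))  # registered sign .. y with diaeresis
    cs = bs[:]
    n = 0
    # Every other byte is shifted to code points starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_to_bytes(token: str) -> bytes:
    """Recover the raw bytes behind a displayed token string (hypothetical helper)."""
    # Assumption: the page renders the space marker (U+0120) as a plain space.
    token = token.replace(" ", "\u0120")
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


print(token_to_bytes("ðŁ").hex())    # f09f
print(token_to_bytes("ï¸ı").hex())   # efb88f
print(token_to_bytes("ï¸ı").decode("utf-8") == "\ufe0f")  # True
```

Under this reading, ðŁ is the two-byte prefix F0 9F that begins the UTF-8 encoding of most emoji, and ï¸ı is the emoji variation selector U+FE0F. Both fit the feature's explanation, since emoji often appear next to handles and hashtags in tweets.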