INDEX
    Explanations

    terms related to promotions and advertisements

    Negative Logits
    Token            Logit
    "ght"            -0.51
    "ReusableCell"   -0.51
    " kö"            -0.51
    "cang"           -0.50
    " contr"         -0.50
    "retweeted"      -0.49
    " diverse"       -0.48
    "αυ"             -0.48
    "HALT"           -0.48
    " enver"         -0.48
    Positive Logits
    Token            Logit
    " promotional"   1.15
    "promo"          1.15
    "Promotional"    1.11
    " promo"         1.10
    "TagMode"        1.10
    " promotions"    1.03
    "Promo"          1.02
    " promos"        0.99
    " Promotional"   0.98
    "promotion"      0.97
    Activation Density: 0.162%

    No Known Activations