INDEX
    Explanations

    mentions of celebrities

    mentions and discussions of celebrities

    Negative Logits
    Token             Logit
    "choes"           -0.87
    "¼"               -0.84
    "anus"            -0.83
    "hematic"         -0.81
    "THER"            -0.79
    "¾"               -0.76
    "¸"               -0.75
    "²¾"              -0.73
    "Ģ"               -0.73
    "tered"           -0.71
    Positive Logits
    Token             Logit
    "rities"          1.11
    " endorsements"   1.06
    " endors"         1.05
    " gossip"         1.00
    " chef"           0.96
    " chefs"          0.87
    " nude"           0.78
    "wcs"             0.77
    " feud"           0.77
    " idols"          0.77
    Activation Density: 0.048%

    No Known Activations