INDEX
    Explanations

    names of celebrities

    mentions of popular celebrities, particularly Kanye West

    Negative Logits
    " suff"        -0.72
    "Downloadha"   -0.71
    "ettlement"    -0.70
    "NetMessage"   -0.67
    "dayName"      -0.67
    "nesota"       -0.66
    "thia"         -0.66
    " Proced"      -0.65
    "ajor"         -0.64
    "llah"         -0.64
    Positive Logits
    " Kanye"       0.91
    "anye"         0.81
    " Kardashian"  0.79
    "ipedia"       0.75
    "mson"         0.72
    "efully"       0.70
    "pants"        0.70
    "weather"      0.69
    "reth"         0.67
    "cé"           0.66
    Activation Density: 0.008%

    No Known Activations