INDEX
    Explanations

    words related to advertisements like "poster" and "bragging"

    references to posters and related imagery

    Negative Logits
    Token            Logit
    "estial"         -0.81
    " Ago"           -0.78
    "%]"             -0.77
    "ESSION"         -0.74
    "hews"           -0.73
    "owship"         -0.71
    "IELD"           -0.67
    " Liberties"     -0.67
    "IVE"            -0.67
    "efe"            -0.66
    Positive Logits
    Token            Logit
    " poster"        1.07
    " posters"       0.99
    "iors"           0.88
    " flyer"         0.84
    "onymous"        0.81
    " Poster"        0.79
    "ieu"            0.79
    "pillar"         0.77
    "ity"            0.73
    " flyers"        0.72
    Activation Density: 0.014%

    No Known Activations