    Explanations

    phrases related to things or people being unpopular

    references to unpopularity

    Negative Logits
    Token             Logit
    "chn"             -0.90
    "urers"           -0.85
    "ramid"           -0.83
    "uther"           -0.75
    "CONCLUS"         -0.74
    "ult"             -0.73
    "EStreamFrame"    -0.71
    "hens"            -0.71
    "tein"            -0.71
    "anwhile"         -0.70
    Positive Logits
    Token             Logit
    " unpopular"       1.23
    "ity"              1.20
    " incumbent"       0.88
    "ities"            0.87
    "liest"            0.83
    " burdens"         0.83
    " disadvant"       0.78
    "lihood"           0.73
    " partisan"        0.73
    " plag"            0.72
    Activation Density: 0.014%

    No Known Activations