    Explanations

    words related to opinions or actions being unpopular

    references to unpopularity

    Negative Logits
    "EStreamFrame"  -0.81
    "erning"        -0.78
    "ramid"         -0.76
    "initely"       -0.75
    "hens"          -0.75
    "chn"           -0.74
    "llular"        -0.72
    "utics"         -0.71
    "arnaev"        -0.71
    "ynthesis"      -0.70
    Positive Logits
    "ity"           1.19
    " unpopular"    1.12
    " incumbent"    0.94
    "ities"         0.89
    " majorities"   0.82
    " burdens"      0.76
    " incumb"       0.75
    "nesses"        0.74
    "lihood"        0.73
    " taboo"        0.72
    Act Density 0.021%

    No Known Activations