    Explanations

    adjectives and nouns related to political views or actions

    terms related to political and social critiques

    Negative Logits
    Token          Logit
    "lished"       -0.75
    "izable"       -0.71
    " shortened"   -0.69
    "ually"        -0.68
    "ized"         -0.68
    "Rated"        -0.67
    "ORED"         -0.67
    " suspended"   -0.67
    "ically"       -0.65
    "FUL"          -0.65
    Positive Logits
    Token          Logit
    "ieties"       1.22
    "isms"         1.17
    "acies"        1.16
    "usions"       1.14
    "ographies"    1.10
    "izons"        1.10
    "iances"       1.08
    "tones"        1.08
    "vironments"   1.06
    "rities"       1.06
    Activation Density: 0.561%

    No Known Activations