INDEX
    Explanations

    terms related to political discourse and ideology

    Negative Logits
    Token       Logit
    "roje"      -0.16
    "ungal"     -0.15
    "ode"       -0.15
    "vez"       -0.14
    "itler"     -0.14
    "ead"       -0.14
    " Pvt"      -0.14
    "udge"      -0.14
    "oje"       -0.14
    "ansen"     -0.14
    Positive Logits
    Token            Logit
    " incorrect"     0.19
    "-economic"      0.19
    "-admin"         0.18
    " Parties"       0.17
    " Incorrect"     0.17
    "/admin"         0.17
    "incorrect"      0.16
    "atform"         0.15
    " Gerr"          0.15
    " correctness"   0.15
    Activation Density: 0.041%

    No Known Activations