    Explanations

    references to political ideologies, particularly left-wing and right-wing distinctions

    Negative Logits
    Token       Logit
    "FX"        -0.15
    "gn"        -0.14
    "azi"       -0.14
    "ubre"      -0.14
    "leston"    -0.14
    " Edgar"    -0.14
    "procs"     -0.14
    " bou"      -0.14
    "ãĥĭãĥ¼"    -0.13
    "icip"      -0.13
    Positive Logits
    Token       Logit
    "yal"       0.16
    "licant"    0.15
    "flen"      0.14
    "WARD"      0.14
    " muschi"   0.14
    "aticon"    0.14
    "sworth"    0.13
    "vron"      0.13
    "olina"     0.13
    "artz"      0.13
    Activation Density: 0.022%

    No Known Activations