    Explanations

    references to political parties, specifically the Communist Party

    Negative Logits
    èĪĪ         -0.15
    inant       -0.15
    説          -0.14
    imbus       -0.14
    "group      -0.13
    à¹Ĭà¸ģ      -0.13
    جÙħÙĪØ¹     -0.13
    ded         -0.13
    zdy         -0.13
    جÙĩ        -0.13

    Positive Logits
    rio         0.20
    yle         0.17
    опол        0.16
    ero         0.15
    tsx         0.15
    ibo         0.14
    teri        0.14
    igr         0.14
    itive       0.14
    orial       0.14
    Activation Density: 0.010%

    No Known Activations