INDEX
    Explanations

    specific political figures and their actions or statements

    Negative Logits
    arov            -0.21
    aday            -0.17
    caler           -0.16
    COPE            -0.16
    Ń               -0.16
    dff             -0.15
    arent           -0.15
    neh             -0.14
    иÑģÑĮ            -0.14
    anna            -0.14

    Positive Logits
    igor            0.15
    reta            0.15
     Чи             0.14
    671             0.14
    oric            0.14
    ILT             0.14
    RenderWindow    0.14
    igma            0.14
    rosso           0.14
     JNIEnv         0.14
    Activation Density: 0.047%

    No Known Activations