INDEX
    Explanations

    terms related to human rights organizations and activities

    Negative Logits
    ckt       -0.15
     inval    -0.15
    acro      -0.14
    BA        -0.13
    apol      -0.13
    ír        -0.13
    ewis      -0.13
    ektor     -0.13
    anus      -0.13
    yro       -0.13
    Positive Logits
    oli       0.16
    reds      0.16
    paque     0.15
    czy       0.15
    vil       0.14
    niej      0.14
    843       0.14
    abled     0.14
    ient      0.14
    att       0.14
    Activation Density: 0.012%

    No Known Activations