    Explanations

    phrases and terms related to human rights violations and organizations working to protect human rights

    Negative Logits
    Token          Logit
    "agers"        -0.82
    "opic"         -0.71
    " ancest"      -0.69
    "driver"       -0.66
    "lass"         -0.64
    "adal"         -0.64
    "age"          -0.61
    "pox"          -0.61
    " stead"       -0.61
    " fries"       -0.61
    Positive Logits
    Token              Logit
    "nesty"            1.01
    " International"   0.87
    "undo"             0.75
    "International"    0.73
    "ãĤ±"              0.67
    " Machina"         0.66
    " Chomsky"         0.66
    "endi"             0.64
    "Choice"           0.63
    "ileaks"           0.63
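
    The token ãĤ± in the positive-logit list looks like a byte-level BPE display string rather than readable text. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-character table (the page does not state which tokenizer is used), and the helper name decode_token is hypothetical. It only handles strings that are fully in that display alphabet, so tokens shown here with a literal leading space would need the space mapped back to "Ġ" first.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Build the GPT-2 style table mapping each byte to a printable character."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted into the range starting at U+0100.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: turn a byte-level display string back into text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("\u00e3\u0124\u00b1"))  # the token shown as ãĤ± above
```

    Under that assumption the three display characters correspond to the bytes E3 82 B1, which is the UTF-8 encoding of a single katakana character.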
    Activation Density: 0.021%

    No Known Activations