INDEX
    Explanations

    phrases related to social justice and human rights issues

    Negative Logits
    Token           Logit
    "issen"         -0.16
    "ategy"         -0.15
    "ervas"         -0.15
    "asz"           -0.14
    "otlin"         -0.14
    "lient"         -0.14
    "outine"        -0.14
    "IFn"           -0.14
    "ìĪł"           -0.14
    ".gov"          -0.13

    Positive Logits
    Token           Logit
    " some"          0.37
    " critics"       0.37
    " many"          0.37
    " experts"       0.31
    " observers"     0.30
    " detr"          0.29
    "Crit"           0.28
    "many"           0.27
    "some"           0.27
    " opponents"     0.26
    Activation Density: 0.290%

    No Known Activations