    Explanations

    words related to social justice issues

    Negative Logits
    çīĪ            -0.81
    cade           -0.80
    ulum           -0.74
    liest          -0.74
    amera          -0.72
    ixel           -0.72
    ainer          -0.71
    eters          -0.70
    gue            -0.70
    itatively      -0.69
    Positive Logits
     persecution     1.07
     extremism       1.06
     violence        1.05
     degradation     1.02
     aggression      1.01
     terrorism       1.01
     sexism          1.01
     criminality     1.00
     mayhem          1.00
     misinformation  1.00
    Activation Density: 1.621%

    No Known Activations