    Explanations

    terms related to various forms of oppression and injustice

    Negative Logits
    "تÙĪÙĨ"         -0.16
    " opposition"  -0.16
    "olina"        -0.15
    "ldr"          -0.15
    "Animations"   -0.14
    " opposite"    -0.14
    " Opposition"  -0.14
    "$MESS"        -0.14
    "ually"        -0.14
    "agr"          -0.14
    Positive Logits
    "zech"         0.15
    "incident"     0.14
    "447"          0.14
    " incidental"  0.14
    "WB"           0.14
    "don"          0.14
    " incident"    0.14
    "ÂŃs"          0.13
    " aug"         0.13
    "UGIN"         0.13
    Activation Density: 0.014%

    No Known Activations