INDEX
    Explanations

    negative sentiments and criticisms related to injustice or inequality

    Negative Logits
    Token           Logit
    italize         -0.15
    locate          -0.15
    ména            -0.15
    racat           -0.15
    exampleInput    -0.15
    ulates          -0.15
    ileen           -0.15
    ieves           -0.14
    ysters          -0.14
    cedes           -0.14
    Positive Logits
    Token           Logit
    ulous           0.19
    orous           0.18
    arious          0.16
    emonic          0.16
    emic            0.16
    ful             0.16
    -than           0.15
    icrous          0.15
    eful            0.15
    ellation        0.15
    Activation Density: 0.516%

    No Known Activations