INDEX
    Explanations

    lists of strengths and weaknesses

    Negative Logits
     injustices    0.20
     denunci       0.20
     prostit       0.19
     renunci       0.19
     injustice     0.18
     wrongdoing    0.18
     immoral       0.18
     अन्याय        0.18
     minorities    0.18
     societal      0.18
    Positive Logits
    {              0.18
     I             0.17
    compatible     0.16
    y              0.16
    6              0.16
    am             0.16
    s              0.16
    et             0.16
    }\             0.16
    which          0.16
    Activation Density: 0.000%

    No Known Activations