INDEX
    Explanations

    negative consequences and experiences related to societal behavior and injustice

    Negative Logits
    Token               Logit
    anced               -0.15
    Comparer            -0.14
    otros               -0.13
    uator               -0.13
    osphere             -0.12
    utar                -0.12
    vanished            -0.12
    *)((                -0.12
    arine               -0.12
     kone               -0.12
    Positive Logits
    Token               Logit
    /null               0.18
    ieber               0.15
     Gunn               0.14
    rud                 0.14
    ulence              0.14
    ElementException    0.14
    byss                0.14
    eczy                0.13
    edback              0.13
    usra                0.13
    Activation Density: 0.526%

    No Known Activations