INDEX
    Explanations

    references to societal values and economic disparities

    Negative Logits
    "åĽ"               -0.17
    "odes"             -0.16
    "abase"            -0.14
    " Buccane"         -0.14
    "aukee"            -0.14
    "ktop"             -0.14
    "anyak"            -0.14
    " Bucc"            -0.14
    "olet"             -0.14
    " precondition"    -0.13
    Positive Logits
    " besides"         0.15
    " Trou"            0.15
    " Guar"            0.15
    "کس"               0.15
    "cone"             0.14
    "(QIcon"           0.14
    " Ide"             0.14
    " concession"      0.14
    " moral"           0.13
    "prs"              0.13
    Act Density 0.286%

    No Known Activations