INDEX
    Explanations

    references to societal issues and social justice

    references to marginalized or affected groups of people

    Negative Logits
    kamp          -0.77
    ob            -0.72
    opoly         -0.70
    ¨             -0.68
    osate         -0.68
    onis          -0.66
    ointment      -0.65
    escription    -0.65
    ILY           -0.65
    orate         -0.64
    Positive Logits
     who          1.17
     wishing      1.15
     pesky        1.00
     entrusted    0.94
     whom         0.92
    who           0.92
     tasked       0.89
     fortunate    0.89
     interested   0.87
     involved     0.87
    Act Density 0.071%

    No Known Activations