INDEX
    Explanations

    phrases showing empathy, support, or concern for different groups of people

    contexts related to global impact and collective consequences

    Negative Logits
    "etting"        -0.82
    "oaded"         -0.73
    "worn"          -0.71
    "Vers"          -0.62
    "pmwiki"        -0.62
    "illac"         -0.61
    "oult"          -0.60
    "imilar"        -0.58
    " disse"        -0.58
    "utenberg"      -0.58
    Positive Logits
    " humankind"     1.12
    " wider"         1.10
    " society"       1.04
    " mankind"       1.03
    " everybody"     1.02
    " anybody"       0.97
    " everyone"      0.96
    " broader"       0.95
    " humanity"      0.95
    " entire"        0.92
    Activation Density: 0.281%

    No Known Activations