INDEX
    Explanations

    references to community involvement and support within diverse groups

    Negative Logits
     ppl
    -0.17
     itself
    -0.17
     яке
    -0.17
     которое
    -0.16
     somebody
    -0.16
     someone
    -0.16
     anybody
    -0.15
     anyone
    -0.15
    orem
    -0.15
    大家
    -0.14
    Positive Logits
     whom
    0.43
     who
    0.28
     backgrounds
    0.25
    who
    0.24
     whose
    0.24
     opposite
    0.21
     Generation
    0.20
    whose
    0.19
     various
    0.19
     different
    0.19
    Act Density 0.306%

    No Known Activations