    Explanations

    pronouns and references to individuals in discussions around accountability and social dynamics

    Negative Logits
     Hamp
    -0.16
    éĻ
    -0.16
    epam
    -0.16
    令
    -0.15
    hle
    -0.14
    urch
    -0.14
    .CopyTo
    -0.14
    ãĥªãĥ¼ãĤº
    -0.14
    rani
    -0.14
    iras
    -0.14
    Positive Logits
    okud
    0.17
    amaha
    0.17
    CAA
    0.15
    lady
    0.15
     jclass
    0.15
    logic
    0.15
     Nagar
    0.14
    esh
    0.14
    ror
    0.14
    ESH
    0.14
    Act Density 0.080%

    No Known Activations