INDEX
    Explanations

    references to social issues and inequalities regarding access or resources

    Negative Logits
    orno
    -0.14
    .rl
    -0.14
    doch
    -0.14
    rapped
    -0.14
    IMER
    -0.14
    furt
    -0.14
    CHANT
    -0.14
    idla
    -0.14
    edition
    -0.14
    zeros
    -0.14
    Positive Logits
     who
    0.66
    who
    0.53
     qui
    0.41
     quien
    0.40
     Who
    0.40
     whose
    0.37
    Who
    0.36
    谁
    0.34
     whom
    0.32
     кто
    0.32
    Act Density 0.321%

    No Known Activations