INDEX
    Explanations

    references to social roles and identities

    Negative Logits
    Token       Logit
    "zers"      -0.17
    " stuff"    -0.16
    "isko"      -0.16
    "ions"      -0.16
    " Blonde"   -0.15
    "ods"       -0.14
    " Jou"      -0.14
    "awe"       -0.14
    "ilden"     -0.14
    "ald"       -0.14
    Positive Logits
    Token       Logit
    " unto"     0.23
    " capable"  0.19
    " who"      0.17
    " able"     0.16
    " extra"    0.16
    "/Area"     0.14
    "ÑĢеб"      0.14
    " reb"      0.14
    " बनन"      0.14
    "ician"     0.14
    Activation Density: 0.226%

    No Known Activations