INDEX
    Explanations

    expressions of personal relationships and social dynamics

    Negative Logits
     esso
    -0.75
     dets
    -0.68
    ньому
    -0.63
     Оно
    -0.46
     []*
    -0.43
     оно
    -0.42
    Rujuakan
    -0.40
    ILLES
    -0.40
     فيه
    -0.39
    gevens
    -0.39
    Positive Logits
     she
    3.59
     her
    2.80
     그녀
    2.73
    2.45
     hennes
    2.38
    彼女は
    2.36
     její
    2.31
    彼女の
    2.30
    เธอ
    2.27
    she
    2.25
    Act Density 1.769%

    No Known Activations