INDEX
    Explanations

    references to individuals, groups, and communities in various contexts

    Negative Logits
    utin
    -0.16
    ento
    -0.15
    anders
    -0.15
    idor
    -0.14
    OAD
    -0.14
     DateTimeOffset
    -0.14
    arial
    -0.14
    ensa
    -0.13
    pered
    -0.13
    พบ
    -0.13
    Positive Logits
     whose
    0.28
     who
    0.26
    whose
    0.22
    who
    0.19
    ابان
    0.17
     koji
    0.15
     kteří
    0.15
     qui
    0.14
    بار
    0.14
    quet
    0.14
    Act Density 0.355%

    No Known Activations