INDEX
    Explanations
    Negative Logits
    Token             Logit
    "Users"           -0.96
    " USERS"          -0.85
    "users"           -0.85
    "łach"            -0.84
    " Users"          -0.81
    "ulets"           -0.79
    "用户"            -0.75
    "elang"           -0.75
    (not rendered)    -0.74
    " води"           -0.74

    Positive Logits
    Token             Logit
    " person"         3.56
    " people"         3.19
    " Person"         3.13
    " persons"        2.89
    "Person"          2.67
    " PERSON"         2.55
    "person"          2.53
    "PERSON"          2.39
    " personnes"      2.33
    " People"         2.25

    Activation Density: 0.033%

    No Known Activations