INDEX
    Explanations

    possessive forms used to indicate ownership or affiliation

    NEGATIVE LOGITS
    "asil"         -0.16
    "àm"           -0.16
    "aler"         -0.16
    "ccione"       -0.15
    "aign"         -0.15
    "aversable"    -0.15
    " Franti"      -0.15
    "ylko"         -0.14
    " aks"         -0.14
    "enate"        -0.14
    POSITIVE LOGITS
    "们"            0.29
    "們"            0.21
    "es"            0.19
    " themselves"   0.18
    "ws"            0.18
    "ths"           0.17
    "swith"         0.17
    "ses"           0.16
    "ubs"           0.15
    "iones"         0.15
    Activation Density: 0.220%

    No Known Activations