INDEX
    Explanations

    contractions or possessive forms

    Negative Logits
    .syntax     -0.16
    jaw         -0.15
    ere         -0.15
    akter       -0.14
     Byrne      -0.14
    blogs       -0.14
     ëij        -0.14
    å¹ķ         -0.14
     Sokol      -0.14
    itive       -0.14
    Positive Logits
    DEX         0.16
    istrov      0.15
    uteur       0.14
    мп          0.14
    μεν         0.13
     pct        0.13
    ัà¸ŀà¸Ĺ      0.13
    UY          0.13
    /ts         0.13
    ovich       0.13
    Act Density 0.089%

    No Known Activations