INDEX
    Explanations

    references to personal identity and ownership

    possessive pronouns and self-references

    Negative Logits
     يتيمه
    -0.59
    SuppressLint
    -0.46
    
    -0.46
    irited
    -0.45
     ſche
    -0.44
     оригіналу
    -0.43
    {#
    -0.43
     समीक्षाओं
    -0.43
     meisje
    -0.43
    -0.43
    Positive Logits
     own
    0.56
    自己
    0.51
    自己是
    0.50
    MessageTagHelper
    0.47
    自分は
    0.47
    自己的
    0.47
    自己在
    0.47
     فريبيس
    0.45
    themselves
    0.43
     themselves
    0.42
    Activation Density: 0.069%

    No Known Activations