    Explanations

    possessive pronouns and their references

    Negative Logits
    hiba        -0.17
    çļĦ大       -0.16
    ãĤĵãģ©      -0.15
     persons    -0.14
    ingly       -0.14
    ndef        -0.14
    çļĦæīĭ      -0.14
     Rarity     -0.14
    azers       -0.14
    luv         -0.14
    Positive Logits
    /her        0.48
    panic       0.34
    /she        0.33
    sing        0.29
    idi         0.24
     himself    0.23
    pter        0.20
    zelf        0.20
    avier       0.20
     Majesty    0.20
    Act Density 0.227%

    No Known Activations