INDEX
    Explanations

    possessive pronouns and their associated references

    Negative Logits
    ernote
    -0.17
    Wunused
    -0.16
    xdd
    -0.15
    _mD
    -0.15
     вÑģп
    -0.14
    udu
    -0.14
     grandchildren
    -0.14
    ellas
    -0.14
    (éĩij
    -0.14
     Jaune
    -0.14
    Positive Logits
     former
    0.23
     
    0.20
     fellow
    0.19
     friend
    0.18
    aho
    0.18
     co
    0.17
     another
    0.17
     colleague
    0.16
     previous
    0.16
    ada
    0.16
    Activation Density: 0.065%

    No Known Activations