    Explanations

    possessive pronouns and expressions of personal ownership

    Negative Logits
    Token           Logit
    " unanim"       -0.17
    "s"             -0.16
    "ories"         -0.15
    "tero"          -0.15
    "elder"         -0.14
    "pig"           -0.14
    "rů"            -0.14
    "STITUTE"       -0.14
    " aftermath"    -0.14
    "icky"          -0.14
    Positive Logits
    Token           Logit
    "rtle"          0.32
    "riad"          0.32
    "opic"          0.29
    "anmar"         0.29
    "ri"            0.26
    "opia"          0.25
    "rrha"          0.25
    " myself"       0.25
    "ths"           0.24
    "embros"        0.23
    Activation Density: 0.142%

    No Known Activations