    Explanations

    possessive language indicating ownership or belonging

    Negative Logits
    oth     -0.18
    odge    -0.17
    ig      -0.16
    ech     -0.16
    wise    -0.16
    sex     -0.15
    PELL    -0.15
    ndon    -0.14
    imore   -0.14
    ritz    -0.14

    Positive Logits
    elves   0.24
     own    0.21
    zelf    0.20
    /her    0.19
    elay    0.18
    gii     0.17
    chaft   0.17
    elon    0.16
    æ¾      0.16
    itable  0.16

    Act Density 0.013%

    No Known Activations