INDEX
    Explanations

    relationships and personal connections

    Negative Logits
    " own"       -0.17
    "own"        -0.17
    "exo"        -0.17
    " ones"      -0.16
    "],[-"       -0.15
    " Own"       -0.15
    "idth"       -0.15
    " Others"    -0.14
    "ien"        -0.14
    "union"      -0.14
    Positive Logits
    " mine"      0.46
    " ours"      0.38
    "mine"       0.35
    "Mine"       0.33
    " hers"      0.33
    " theirs"    0.32
    " Mine"      0.32
    " mines"     0.30
    " yours"     0.28
    " Yours"     0.25
    Act Density 0.034%

    No Known Activations