INDEX
    Explanations

    possessive pronouns

    Negative Logits
    " on"        -0.08
    "-on"        -0.07
    " scalar"    -0.07
    " to"        -0.07
    " onto"      -0.07
    " align"     -0.07
    "рос"        -0.07
    " TO"        -0.07
    "生成"       -0.07
    " from"      -0.06

    Positive Logits
    " my"        0.08
    " their"     0.08
    "'s"         0.08
    "’s"         0.07
    " her"       0.07
    " nevy"      0.07
    " Its"       0.07
    " in"        0.07
    " My"        0.07
    " your"      0.07

    Act Density 0.069%

    No Known Activations