INDEX
    Explanations

    possessive pronouns and related attributes

    Negative Logits
    uzzi
    -0.15
    etur
    -0.15
    aal
    -0.15
    ilst
    -0.14
     Μέ
    -0.14
     suff
    -0.14
    uil
    -0.14
    ños
    -0.14
    ulfilled
    -0.14
    intree
    -0.14
    Positive Logits
     gonna
    0.26
     been
    0.22
     not
    0.20
     afraid
    0.19
     gone
    0.18
    gon
    0.18
    'e
    0.18
     going
    0.17
     done
    0.17
    ’e
    0.16
    Act Density 0.097%

    No Known Activations