INDEX
    Explanations

    occurrences of possessive pronouns and related phrases

    Negative Logits
    " '"           -0.16
    " majority"    -0.15
    " ["           -0.15
    "s"            -0.15
    " hem"         -0.15
    "éĨ´"          -0.15
    "ItemImage"    -0.15
    "loor"         -0.15
    "ore"          -0.15
    " com"         -0.14
    Positive Logits
    "edom"            0.17
    "eniz"            0.16
    "Uvs"             0.15
    "ften"            0.15
    "IFn"             0.14
    " âĹĦ"            0.14
    "-fw"             0.14
    "CharacterSet"    0.14
    "uptools"         0.14
    ".shiro"          0.14
    Activation Density: 0.019%

    No Known Activations