INDEX
    Explanations

    possessive pronouns and related articles

    Negative Logits
    "arat"        -0.17
    " lies"       -0.16
    "oder"        -0.15
    " weather"    -0.15
    " wom"        -0.15
    " lying"      -0.15
    "omm"         -0.14
    "Ïįν"         -0.14
    "935"         -0.14
    "urable"      -0.14
    Positive Logits
    "ñana"        0.17
    "eso"         0.15
    "hud"         0.15
    "/cop"        0.14
    "NING"        0.14
    "/Internal"   0.14
    "]--;↵"       0.14
    "resh"        0.14
    "usu"         0.14
    "озем"        0.14
    Activation Density: 0.000%

    No Known Activations