INDEX
    Explanations

    occurrences of first-person references or personal pronouns

    Negative Logits
    Token        Logit
    " Whe"       -0.15
    "IPA"        -0.15
    "ектор"      -0.15
    " rowNum"    -0.14
    "çµ"         -0.14
    " öld"       -0.14
    "ोह"         -0.13
    " Fro"       -0.13
    " displ"     -0.13
    "ois"        -0.13
    Positive Logits
    Token        Logit
    ".leading"   0.17
    "Leading"    0.15
    " leading"   0.15
    "andler"     0.15
    "hle"        0.15
    "ogan"       0.14
    "_MUT"       0.14
    "ABLE"       0.14
    "inline"     0.14
    "amba"       0.14
    Activation Density: 0.017%

    No Known Activations