INDEX
    Explanations

    occurrences of the pronoun "I"

    Negative Logits
    pedia       -0.18
    INCLUDED    -0.17
    a           -0.16
    aling       -0.16
    p           -0.16
    e           -0.16
    vro         -0.15
    pars        -0.15
    áºŃn        -0.15
    onym        -0.15
    Positive Logits
    .e          0.23
    E           0.20
    L           0.19
    omanip      0.18
    G           0.17
    M           0.17
    ylland      0.17
    F           0.17
    N           0.17
    D           0.16
    Act Density 0.044%

    No Known Activations