INDEX
    Explanations

    instances of the pronoun "I" in various contexts

    Negative Logits
    Token   Logit
    ve      -0.31
    t       -0.28
    an      -0.28
    l       -0.27
    ke      -0.26
    f       -0.24
    r       -0.24
    m       -0.24
    n       -0.24
    d       -0.23
    Positive Logits
    Token   Logit
    TERS    0.17
    cntl    0.16
    i       0.16
    ií      0.16
    mit     0.16
    ウン    0.15
    eee     0.15
    AU      0.15
    udic    0.15
    ADE     0.15
    Act Density 0.053%

    No Known Activations