INDEX
    Explanations

    the frequency of the pronoun 'I' in the text

    Negative Logits
    c       -0.26
    p       -0.25
    e       -0.24
    orem    -0.22
    b       -0.21
    v       -0.20
    x       -0.19
    a       -0.19
    z       -0.19
    h       -0.19
    Positive Logits
    E       0.23
    M       0.21
    C       0.20
    A       0.19
    D       0.19
    O       0.19
    L       0.19
    TRGL    0.18
    P       0.18
    N       0.18
    Activation Density: 0.020%

    No Known Activations