INDEX
    Explanations

    instances of the letter 'z' in various contexts

    Negative Logits
    Token    Logit
    u        -0.21
    z        -0.21
    h        -0.20
    ar       -0.20
    w        -0.20
    n        -0.20
    v        -0.18
    im       -0.18
    b        -0.18
    ap       -0.17
    Positive Logits
    Token    Logit
    odiac    0.28
    ebra     0.22
    onal     0.20
    ircon    0.20
    zz       0.18
    oned     0.18
    ipped    0.18
    ipline   0.17
    ucchini  0.17
    witter   0.17
    Activation Density: 0.011%

    No Known Activations