INDEX
    Explanations

    the letter 'y' in various contexts

    Negative Logits
    Token    Logit
    o        -0.28
    a        -0.26
    u        -0.25
    i        -0.23
    r        -0.22
    y        -0.21
    t        -0.21
    ay       -0.20
    n        -0.20
    an       -0.19
    Positive Logits
    Token    Logit
    achts    0.27
    ea       0.21
    oke      0.21
    ester    0.21
    anked    0.19
    ean      0.19
    سطس      0.18
    oked     0.18
    onder    0.18
    olk      0.17
    Activation Density: 0.018%

    No Known Activations