INDEX
    Explanations

    occurrences of the letter 'R' in various contexts

    Negative Logits
    adius       -0.25
    untime      -0.22
    andom       -0.21
    udy         -0.20
    aise        -0.20
    ounds       -0.19
    adio        -0.19
    adeon       -0.18
    ange        -0.18
    icht        -0.17
    Positive Logits
    ramework    0.17
    otor        0.17
    iom         0.16
    iw          0.16
    quiv        0.15
    ectors      0.15
    rani        0.15
    YA          0.15
    inder       0.15
    yal         0.14
    Activation Density: 0.041%

    No Known Activations