INDEX
    Explanations

    occurrences of the letter 's' in various contexts

    Negative Logits
    Token    Logit
    ocket    -0.27
    n        -0.23
    p        -0.23
    ys       -0.23
    ql       -0.23
    cript    -0.23
    hip      -0.23
    ub       -0.23
    к        -0.22
    c        -0.22
    Positive Logits
    Token    Logit
    ras      0.17
    rb       0.17
    put      0.17
    raman    0.17
    osos     0.17
    chez     0.16
    os       0.16
    oso      0.16
    odal     0.16
    meal     0.15
    Activation Density: 0.028%

    No Known Activations