INDEX
    Explanations

    the letter 'w' in various contexts within the text

    Negative Logits
    yg          -0.20
    r           -0.19
     unw        -0.17
    rav         -0.17
    ÙĦ          -0.17
    il          -0.17
    ر           -0.17
    y           -0.16
    rang        -0.16
    ys          -0.16
    Positive Logits
    ester        0.23
    arden        0.21
    ondrous      0.21
    alled        0.20
    iser         0.20
    ares         0.20
    avy          0.20
    alter        0.20
    iley         0.19
    asser        0.18
    Activation Density: 0.015%

    No Known Activations