    Explanations

    instances of the letter 'T' and related characters in the text

    Negative Logits
    Token       Logit
    " tune"     -0.17
    "ån"        -0.17
    "ietet"     -0.15
    "imoto"     -0.14
    "quent"     -0.14
    "ished"     -0.14
    "gulp"      -0.14
    "gua"       -0.14
    "itored"    -0.14
    "ittest"    -0.14
    Positive Logits
    Token       Logit
    "hat"       0.34
    "o"         0.33
    "here"      0.31
    "he"        0.30
    "his"       0.29
    "hey"       0.28
    "hen"       0.25
    "h"         0.24
    "hus"       0.24
    " hat"      0.20
    Activation Density: 0.015%

    No Known Activations