    Explanations

    occurrences of the letter 't'

    Negative Logits
    Token               Logit
    " Theſe"            -1.11
    "ſelves"            -0.98
    "ſelf"              -0.98
    " themſelves"       -0.97
    " Anſ"              -0.97
    " ſever"            -0.94
    " Beſ"              -0.93
    " ſeveral"          -0.92
    " doubtnut"         -0.92
    " myſelf"           -0.91

    Positive Logits
    Token               Logit
    " t"                1.39
    "t"                 1.19
    "T"                 1.19
    " T"                1.18
    "getT"              1.04
                        0.95
    "𝘁"                 0.93
    "Viitteet"          0.85
    "zt"                0.84
    " Catt"             0.81

    Activation Density: 0.177%

    No Known Activations