INDEX
    Explanations

    phrases where the word "the" appears close to another word

    repetitive use of the word "the."

    Negative Logits
    "tackle"      -0.89
    "anon"        -0.71
    "thood"       -0.71
    "=#"          -0.69
    "adays"       -0.68
    "quished"     -0.68
    "iversal"     -0.67
    " again"      -0.67
    "still"       -0.66
    "CLA"         -0.66
    Positive Logits
    " slightest"      1.31
    " simplest"       1.29
    " smallest"       1.16
    " finest"         1.09
    " cheapest"       1.06
    " basics"         1.04
    " hars"           1.00
    " richest"        0.98
    " easiest"        0.98
    " wealthiest"     0.98
    Activation Density: 0.152%

    No Known Activations