INDEX
    Explanations

    phrases that emphasize the presence of "the" in various contexts

    Negative Logits
    Token          Logit
    " remainder"   -0.18
    " brighter"    -0.16
    " sharper"     -0.16
    "igham"        -0.16
    "oten"         -0.15
    "igger"        -0.15
    "rens"         -0.15
    " happier"     -0.14
    "ichen"        -0.14
    "eness"        -0.14
    Positive Logits
    Token          Logit
    " ici"         0.29
    " rou"         0.26
    " bol"         0.26
    " fran"        0.26
    " flatt"       0.25
    " col"         0.25
    " slee"        0.25
    " dri"         0.24
    " cris"        0.24
    " slic"        0.23
    Activation Density: 0.262%

    No Known Activations