INDEX
    Explanations

    specific mentions of the word "The"

    repetitions of the word "the."

    Negative Logits
    pers      -0.84
    tec       -0.68
    .","      -0.68
    blem      -0.67
    soever    -0.66
    ée        -0.66
    ben       -0.66
    wart      -0.65
    ecided    -0.65
    won       -0.65
    Positive Logits
     latter            1.01
     aforementioned    0.98
     foregoing         0.94
     latest            0.93
     simplest          0.93
     same              0.91
     emergence         0.90
    oret               0.89
     largest           0.84
     aftermath         0.83
    Activation Density: 0.437%

    No Known Activations