INDEX
    Explanations

    mentions of "the" and its variants (such as "The") in the text

    Negative Logits
    Token           Logit
    "ther"          -0.17
    "ightly"        -0.17
    " Fus"          -0.16
    "icum"          -0.16
    "mer"           -0.15
    "ns"            -0.15
    "actly"         -0.15
    " fus"          -0.15
    "sec"           -0.14
    "structions"    -0.14
    Positive Logits
    Token           Logit
    "oretical"      0.25
    "odore"         0.18
    "orem"          0.17
    "oret"          0.17
    "Ģ"             0.16
    "atre"          0.16
    "viso"          0.16
    "ostel"         0.15
    "ERSHEY"        0.15
    "tul"           0.15
    Activation Density: 0.301%

    No Known Activations