INDEX
    Explanations

    occurrences of the word "the."

    Negative Logits
    apan       -0.15
    andi       -0.15
    pu         -0.14
    iera       -0.14
    ccoli      -0.13
    ायन        -0.13
    Atl        -0.13
     sub       -0.13
    nie        -0.13
    zug        -0.13
    Positive Logits
    isl        0.14
    bih        0.14
    raquo      0.14
    iná        0.14
    fulness    0.14
    orex       0.13
    icha       0.13
    CHED       0.13
    ुण         0.13
    FR         0.13
    Act Density 0.027%

    No Known Activations