INDEX
    Explanations

    repeated occurrences of the word "the."

    Negative Logits
    Token               Logit
    "edList"            -0.15
    "Reach"             -0.15
    "ocale"             -0.15
    "ussen"             -0.15
    "atak"              -0.14
    " DAMAGES"          -0.14
    "нила"              -0.14
    "'''č↵"             -0.14
    " CONSEQUENTIAL"    -0.14
    "vÃŃ"               -0.14
    Positive Logits
    Token               Logit
    " same"             0.24
    " equivalent"       0.22
    "same"              0.20
    " beginnings"       0.19
    " opportunity"      0.18
    " ability"          0.18
    " following"        0.17
    "عÛĮ"               0.16
    " option"           0.16
    " même"             0.16
    Activation Density: 0.613%

    No Known Activations