INDEX
    Explanations

    occurrence of numbers and specific formatting characters

    Negative Logits
    Token          Logit
    "rys"          -0.16
    " clap"        -0.16
    " stringWith"  -0.15
    "uggy"         -0.15
    " Corner"      -0.14
    "undry"        -0.14
    " Thrones"     -0.14
    " Straw"       -0.14
    " Sims"        -0.14
    " Saunders"    -0.14
    Positive Logits
    Token          Logit
    " Monica"      0.15
    " Ash"         0.15
    "Ash"          0.15
    "iban"         0.15
    " rem"         0.15
    "blo"          0.14
    "-stat"        0.14
    " Lucia"       0.14
    " stats"       0.14
    " Fab"         0.14
    Activation Density: 0.024%

    No Known Activations