INDEX
    Explanations

    the repeated use of a specific word or suffix indicating a pattern or theme

    Negative Logits
    "terday"          -0.76
    " Michaels"       -0.72
    "Downloadha"      -0.68
    "IDENT"           -0.66
    " Leilan"         -0.65
    " Staples"        -0.64
    "ensional"        -0.63
    "staking"         -0.63
    " apprehension"   -0.61
    " dors"           -0.58
    Positive Logits
    "gy"               1.20
    "gers"             1.19
    "roup"             1.12
    "ues"              1.06
    "roups"            1.06
    "ogo"              1.05
    "glers"            1.01
    "raphic"           0.99
    "uild"             0.97
    "ging"             0.95
    Act Density 0.013%

    No Known Activations