    Explanations

    email-related text such as email addresses, verification requests, and account creation steps

    Negative Logits
    "utsche"      -0.75
    "bite"        -0.65
    "sych"        -0.63
    " Panther"    -0.62
    " Bears"      -0.61
    " Rath"       -0.60
    "tsky"        -0.60
    "fml"         -0.60
    "nz"          -0.60
    " Telecom"    -0.59
    Positive Logits
    "prise"       1.03
    "prising"     1.02
    "prises"      0.97
    "tainment"    0.94
    "taining"     0.88
    "tain"        0.84
    "tained"      0.78
    " captcha"    0.75
    "Password"    0.74
    "obar"        0.74
    Activation Density: 5.853%

    No Known Activations