    Explanations

    Words indicating giveaways or promotions

    Tokens that mark the end of the document or otherwise indicate document termination

    Negative Logits
    " constitu"    -0.79
    "opausal"      -0.75
    ")]."          -0.70
    " destro"      -0.68
    " occas"       -0.67
    " exha"        -0.66
    " nerv"        -0.65
    "etheless"     -0.64
    "vae"          -0.64
    " submar"      -0.63
    Positive Logits
    "ings"          1.28
    "away"          1.15
    "aways"         1.04
    "ers"           1.00
    "ables"         1.00
    " Yourself"     0.98
    " Your"         0.95
    "ners"          0.90
    "ership"        0.89
    "ments"         0.88
    Activation Density: 0.213%

    No Known Activations