INDEX
    Explanations

    information related to safety procedures or guidelines

    Negative Logits
    Token             Logit
    "twitter"         -0.75
    " Bashar"         -0.75
    " Maduro"         -0.75
    " Confederacy"    -0.73
    " mutants"        -0.70
    " presided"       -0.70
    " Murdoch"        -0.70
    " Plaint"         -0.69
    "iannopoulos"     -0.69
    " riots"          -0.69
    Positive Logits
    Token             Logit
    " beginner"       1.25
    " beginners"      1.14
    "Your"            1.06
    "your"            1.04
    " Tips"           1.02
    " Helpful"        1.02
    "Learn"           1.02
    "Tips"            1.00
    " Yourself"       0.98
    " your"           0.97
    Activation Density: 3.554%

    No Known Activations