INDEX
    Explanations

    explicit warnings or cautions in a text

    phrases that emphasize safety precautions and warnings

    Negative Logits
    "cart"             -0.69
    " magically"       -0.67
    " Founder"         -0.66
    " descendants"     -0.63
    " independents"    -0.62
    "Spons"            -0.61
    " creator"         -0.61
    " Canad"           -0.60
    "creator"          -0.60
    "Joined"           -0.59

    Positive Logits
    " beware"          1.13
    " precautions"     1.12
    " caution"         1.11
    " Avoid"           1.02
    "Avoid"            0.99
    "avoid"            0.98
    " lest"            0.96
    " carefully"       0.94
    " heed"            0.94
    " precaution"      0.92

    Activation Density: 0.845%

    No Known Activations