INDEX
    Explanations

    phrases related to safety advice and warnings

    warnings and advice related to safety

    Negative Logits
    Token             Logit
    " fame"           -0.69
    " descendants"    -0.69
    " creator"        -0.67
    " magically"      -0.66
    " victorious"     -0.66
    " unveiling"      -0.66
    "created"         -0.64
    " transformed"    -0.64
    " Divinity"       -0.63
    " Noble"          -0.63
    Positive Logits
    Token             Logit
    " caution"        1.59
    " beware"         1.58
    "eware"           1.30
    " precautions"    1.25
    "Avoid"           1.23
    " precaution"     1.22
    "avoid"           1.17
    " vigilance"      1.16
    " advis"          1.16
    " advisable"      1.16
    Activation Density: 1.291%

    No Known Activations