    Explanations

    language related to safety and security

    phrases related to safety concerns

    Negative Logits
    Token           Logit
    "dx"            -0.88
    "issance"       -0.85
    "eric"          -0.78
    "eta"           -0.78
    "igs"           -0.75
    "sth"           -0.75
    "naire"         -0.74
    "sonian"        -0.69
    "yss"           -0.66
    "iguous"        -0.64

    Positive Logits
    Token           Logit
    " safety"       1.20
    "ailability"    1.00
    "safety"        0.95
    "saf"           0.87
    "Þ"             0.85
    " practition"   0.84
    " Safety"       0.81
    "Safety"        0.80
    " condem"       0.80
    " ingred"       0.75

    Activation Density: 0.021%

    No Known Activations