INDEX
    Explanations

    references to safety regulations and standards

    Negative Logits
    Token          Logit
    " mik"         -0.64
    " Jackman"     -0.64
    " Cooke"       -0.61
    "Cyclo"        -0.59
    "Bindable"     -0.58
    "ITHUB"        -0.57
    " بست"         -0.57
    "mik"          -0.57
    " Tup"         -0.56
    " vỡ"          -0.56
    Positive Logits
    Token             Logit
    " safety"         2.01
    "Safety"          1.99
    " Safety"         1.99
    "safety"          1.85
    " SAFETY"         1.83
    "SAFETY"          1.75
    "afety"           1.75
    " Sicherheits"    1.09
    " 安全"            1.07
    "Precautionary"   1.07
    Act Density 0.061%

    No Known Activations