    Explanations

    phrases related to posing risks or threats

    phrases indicating potential threats or risks

    Negative Logits
    Token          Logit
    "ciples"       -0.74
    " Manufact"    -0.72
    "cknowled"     -0.70
    "essen"        -0.68
    "mith"         -0.68
    "azines"       -0.66
    "endars"       -0.65
    "enson"        -0.65
    "write"        -0.65
    "£ı"           -0.64
    Positive Logits
    Token          Logit
    " threat"      1.31
    " hazard"      1.27
    " danger"      1.20
    " risk"        1.15
    " risks"       1.07
    " challenge"   1.05
    " dangers"     1.02
    " hazards"     0.98
    " hurdle"      0.98
    " peril"       0.98
    Activation Density: 0.155%

    No Known Activations