INDEX
    Explanations

    warnings and alerts

    sections related to warnings and alerts about potential dangers or consequences

    Negative Logits
    Token           Logit
    "aepernick"     -0.78
    "Favorite"      -0.72
    "iques"         -0.70
    "aez"           -0.69
    " excuse"       -0.68
    "Interview"     -0.66
    "MRI"           -0.66
    "yard"          -0.66
    "athon"         -0.66
    "oyer"          -0.65

    Positive Logits
    Token           Logit
    " dangers"      1.31
    " risks"        1.09
    " impending"    1.04
    " pitfalls"     0.99
    " danger"       0.98
    " warnings"     0.93
    "danger"        0.93
    " hazards"      0.92
    " dire"         0.89
    " beware"       0.88
    Act Density 0.152%
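
    The "Act Density" figure is most naturally read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are hypothetical, and the dashboard does not state how its value was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (above-threshold) activation. The source does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 15 of which activate the feature.
sample = [0.0] * 9985 + [0.5] * 15
density = activation_density(sample)
print(f"{density:.3%}")
```

    Under this reading, a density of 0.152% would mean the feature is active on roughly 15 of every 10,000 tokens, which fits a sparse feature tied to a narrow topic such as warnings.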

    No Known Activations