    Explanations

    phrases related to warnings and alerts regarding significant issues or upcoming dangers

    Negative Logits
    Token          Logit
    " Hairst"      -0.17
    "ittle"        -0.15
    "Answers"      -0.14
    "heartbeat"    -0.14
    "278"          -0.14
    ".answers"     -0.13
    "ividual"      -0.13
    "(___"         -0.13
    "orgot"        -0.13
    " ANSW"        -0.13

    Positive Logits
    Token          Logit
    " warning"     1.05
    " warnings"    0.95
    " Warning"     0.88
    " warn"        0.87
    "warning"      0.83
    "Warning"      0.80
    " warned"      0.79
    "warn"         0.76
    "-warning"     0.75
    "warnings"     0.75
    Activation Density: 0.283%

    No Known Activations