INDEX
    Explanations

    words related to danger and risk

    references to danger or threats

    Negative Logits
    Token      Logit
    "issance"  -0.71
    "ulous"    -0.69
    "eenth"    -0.66
    "orney"    -0.66
    "atters"   -0.65
    "ergy"     -0.64
    "GB"       -0.64
    "pel"      -0.61
    "guyen"    -0.60
    "anmar"    -0.59
    Positive Logits
    Token       Logit
    "ously"     1.12
    " lur"      1.01
    " posed"    0.97
    "ous"       0.94
    " lurking"  0.88
    " Danger"   0.84
    " zone"     0.84
    "OUS"       0.83
    "saf"       0.83
    " hazards"  0.81
    Activation Density: 0.035%

    No Known Activations