    Explanations

    warning messages or alerts

    warning statements related to potential dangers or sensitive content

    Negative Logits
    Token          Logit
    "aepernick"    -0.85
    "Laughs"       -0.78
    "Favorite"     -0.76
    "ichick"       -0.76
    "obbies"       -0.76
    "Interview"    -0.74
    "brates"       -0.73
    "hement"       -0.71
    "chens"        -0.69
    " excuse"      -0.69
    Positive Logits
    Token              Logit
    " dangers"         1.45
    " impending"       1.24
    " risks"           1.18
    " danger"          1.16
    " pitfalls"        1.12
    " beware"          1.06
    " dire"            1.05
    " imminent"        1.03
    "danger"           0.96
    " consequences"    0.95
    Activation Density: 0.268%

    No Known Activations