    Explanations

    AI chatbot disclaimers mental health violence

    Negative Logits
     unfolded        0.74
     depositphotos   0.70
     சாப்பிட          0.69
     undec           0.68
    unia             0.65
     පො              0.65
     zet             0.64
     გახ             0.64
     clap            0.64
     ചെയ             0.63
    Positive Logits
    Repl             0.61
    人的              0.60
    ắt               0.56
                     0.55
    LAM              0.54
     Ble             0.54
     Severe          0.54
                     0.54
    SA               0.53
    Re               0.53
    Activation Density: 0.156%

    No Known Activations