INDEX
    Explanations

    terms related to suicide and self-harm

    Negative Logits
    Token            Logit
    "Personendaten"  -0.57
    " rö"            -0.49
    " dinosau"       -0.41
    " dino"          -0.41
    " hou"           -0.40
    " Ladybug"       -0.40
    " Yandex"        -0.39
    " dyn"           -0.39
    " Beur"          -0.39
    "CodeDom"        -0.39

    Positive Logits
    Token         Logit
    " suicide"    1.32
    "suicide"     1.17
    "Suicide"     1.11
    " suicidio"   1.02
    " Suicide"    1.01
    " suicides"   1.00
    " suicidal"   0.97
    "自杀"        0.90
    "自殺"        0.85
    " suic"       0.79

    Activation Density: 0.284%

    No Known Activations