    Explanations

    expressions related to emotional distress or suicidal thoughts

    actions (verbs related to taking, giving, sending)

    Negative Logits
    Datuak
    -0.56
    UrlResolution
    -0.54
    principalColumn
    -0.52
     للمعارف
    -0.51
    rungsseite
    -0.49
     onData
    -0.47
    клопе
    -0.47
     uVar
    -0.46
    genossen
    -0.45
     vettor
    -0.45
    Positive Logits
    0.75
    取り
    0.59
    0.56
     り
    0.55
    0.54
    立ち
    0.54
    書き
    0.53
    を作り
    0.53
    0.50
    0.50
    Act Density 0.010%

    No Known Activations