    Explanations

    negative sentiment or ethical issues

    Negative Logits
     collecte  0.55
     사용하여  0.52
     distributes  0.51
     celebrates  0.50
    ματο  0.50
     pomoc  0.50
    جمع  0.50
    приклад  0.49
     oluştur  0.49
     confers  0.49
    Positive Logits
     resentment  0.62
     Didn  0.62
     violating  0.62
     wrongdoing  0.61
     worsening  0.61
     uneasy  0.60
     नहीं  0.60
     tyranny  0.59
     undermining  0.59
     دلیل  0.58
    Activation Density: 3.302%

    No Known Activations