    Explanations

    instances of blame and lack of accountability

    Negative Logits
    icari    -0.17
    افق      -0.14
    iband    -0.14
    entai    -0.14
    renc     -0.14
     kå      -0.14
    _TI      -0.14
    虑       -0.14
    pragma   -0.14
    ajs      -0.14
    Positive Logits
     blame             0.58
     blaming           0.46
     blames            0.46
     blamed            0.44
     responsibility    0.37
     fault             0.36
    责任               0.33
    責                 0.32
     Responsibility    0.31
    Respons            0.29
    Activation Density: 0.240%

    No Known Activations