INDEX
    Explanations

    sentences that indicate a lack of personal responsibility or avoidance of blame

    Negative Logits
    гоÑĤ        -0.16
    ast         -0.15
    adel        -0.14
    åĸľ         -0.14
    ÃŃd         -0.14
    elts        -0.14
    iska        -0.14
    jos         -0.14
    -guard      -0.14
    ORB         -0.13
    Positive Logits
    ampo        0.17
    UDGE        0.15
    baum        0.15
    _FILL       0.14
    ombine      0.14
    dorf        0.14
    strup       0.14
    isin        0.14
    utow        0.14
    leftright   0.13
    Act Density 0.529%

    No Known Activations