INDEX
    Explanations

    phrases related to taking responsibility for actions

    expressions of personal or collective responsibility

    Negative Logits
    Token         Logit
    "glers"       -0.67
    "inton"       -0.66
    "wikipedia"   -0.63
    "tering"      -0.63
    "oby"         -0.62
    "ropolis"     -0.62
    "ãĥĩãĤ£"      -0.62
    "erker"       -0.62
    "tein"        -0.61
    " Racer"      -0.61
    Positive Logits
    Token                 Logit
    " responsibilities"   0.93
    " accountable"        0.90
    " responsibility"     0.89
    " Responsibility"     0.81
    "ibilities"           0.81
    " for"                0.80
    " forg"               0.80
    "abilities"           0.78
    " entrusted"          0.77
    " responsibly"        0.77
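
Logit lists like the ones above are commonly produced by projecting a feature's output direction through the model's unembedding matrix and ranking tokens by the resulting score. The source does not say how these values were derived, so the following is only a minimal sketch of that ranking step under that assumption. The names `logit_effects`, `top_tokens`, `feature_direction`, and `unembed` are hypothetical, and the numbers are toy values, not taken from any real model.

```python
def logit_effects(feature_direction, unembed):
    """Dot the feature direction with each token's unembedding vector."""
    return {
        token: sum(d * w for d, w in zip(feature_direction, column))
        for token, column in unembed.items()
    }


def top_tokens(effects, k, positive=True):
    """Return the k tokens with the largest (or smallest) logit effect."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=positive)
    return ranked[:k]


# Hypothetical 2-dimensional toy data, chosen only for illustration.
feature_direction = [1.0, 0.5]
unembed = {
    " responsibility": [0.8, 0.2],
    " accountable": [0.7, 0.2],
    "wikipedia": [-0.6, -0.1],
    " Racer": [-0.5, -0.2],
}

effects = logit_effects(feature_direction, unembed)
positive = top_tokens(effects, k=2, positive=True)   # most promoted tokens
negative = top_tokens(effects, k=2, positive=False)  # most suppressed tokens
```

Tokens are kept as exact strings, including any leading space, because a leading space distinguishes a word-initial token from a word-internal fragment.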
    Activation Density: 0.040%
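
An activation density is usually read as the fraction of tokens on which the feature fires. The sketch below assumes that definition, with any activation above zero counting as firing; the source does not define the metric. The function name `activation_density`, its `threshold` parameter, and the sample activations are hypothetical and made up for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 4 active tokens out of 10,000.
sample = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.040%
```

Under this reading, a density of 0.040% corresponds to roughly 4 firing tokens per 10,000.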

    No Known Activations