INDEX
    Explanations

    words related to moral or ethical duty

    mention of the concept of responsibility

    Negative Logits
    Token               Logit
    "arthed"            -0.72
    "sell"              -0.70
    "vae"               -0.69
    " Fort"             -0.66
    " Hig"              -0.65
    " Ellison"          -0.65
    " Stall"            -0.64
    " Maver"            -0.64
    " Ashton"           -0.64
    " Tigers"           -0.64

    Positive Logits
    Token               Logit
    " responsibility"    1.34
    " responsibilities"  1.11
    " Responsibility"    1.05
    " respons"           0.97
    "ignty"              0.90
    "respons"            0.89
    "responsible"        0.86
    " culp"              0.85
    "lessly"             0.81
    " obligation"        0.81
    Activation Density: 0.012%

    No Known Activations