INDEX
    Explanations

    mentions of ethical matters or breaches

    references to ethics concepts and discussions

    Negative Logits
    "nces"          -0.80
    "upt"           -0.79
    " Clockwork"    -0.69
    "xual"          -0.69
    "down"          -0.68
    " Ingram"       -0.67
    "noon"          -0.67
    "ept"           -0.67
    " Jub"          -0.67
    "eworld"        -0.66
    Positive Logits
    "onomic"        1.11
    "onom"          0.88
    " dile"         0.83
    "ostics"        0.82
    "ethical"       0.80
    " violations"   0.76
    " watchdog"     0.74
    " hazard"       0.73
    " disclosure"   0.73
    " breaches"     0.73
    Activation Density: 0.029%

    No Known Activations