INDEX
    Explanations

    phrases related to ethics and standards

    phrases related to ethical standards and responsibility

    Negative Logits
        Token           Logit
        latent          -0.65
        nifty           -0.64
        unwitting       -0.62
        giant           -0.62
        obscure         -0.61
        coincidence     -0.61
        moot            -0.60
        tantal          -0.59
        bombs           -0.59
        ado             -0.59
    Positive Logits
        Token           Logit
        regardless      1.11
        irrespective    1.06
        wherever        0.87
        respectfully    0.80
        (whitespace)    0.78
        throughout      0.78
        RESP            0.77
        whenever        0.76
        safegu          0.75
        .''.            0.75
    Activation Density: 0.605%

    No Known Activations