    Explanations

    phrases related to morality and ethical considerations

    terms related to ethics and moral considerations

    Negative Logits
    Token             Logit
    "xual"            -0.92
    "-+"              -0.72
    "oday"            -0.68
    " Twice"          -0.66
    "anian"           -0.66
    "gery"            -0.65
    "mble"            -0.64
    "nces"            -0.64
    "ptives"          -0.63
    "lv"              -0.63

    Positive Logits
    Token             Logit
    " compass"        1.15
    "istic"           1.08
    " indignation"    1.05
    "izing"           1.04
    " relat"          1.00
    " dile"           0.99
    " hazard"         0.97
    " equival"        0.96
    "ising"           0.94
    "IZE"             0.92
    Activation Density: 0.056%

    No Known Activations