INDEX
    Explanations

    phrases related to moral values and ethical principles

    Negative Logits
    backer          -0.76
    stal            -0.70
    anie            -0.70
    ETA             -0.68
    biz             -0.66
    Guard           -0.65
    bg              -0.63
    inar            -0.61
    aukee           -0.61
    gallery         -0.60
    Positive Logits
     there          0.79
     homosexuality  0.76
     "[             0.75
     although       0.73
     "...           0.70
     preserving     0.69
     '[             0.69
     legalizing     0.68
     "…             0.67
     they           0.66
    Act Density 0.181%

    No Known Activations