INDEX
    Explanations

    words related to morality

    discussions related to concepts of morality and ethics

    Negative Logits
    Token        Logit
    "eding"      -0.85
    "ept"        -0.82
    "location"   -0.81
    "mining"     -0.77
    "upon"       -0.74
    " Roses"     -0.72
    "aways"      -0.71
    "eworld"     -0.70
    "WER"        -0.70
    "berry"      -0.70
    POSITIVE LOGITS
    ocracy
    0.85
    ¿½
    0.77
     contag
    0.75
    ocratic
    0.71
     Petr
    0.70
    ocrats
    0.69
    acus
    0.67
     compass
    0.65
     prev
    0.64
     righteousness
    0.63
    Activation Density: 0.020%

    No Known Activations