    Explanations

    words related to positive moral qualities or ethical concepts

    concepts related to moral or ethical goodness

    Negative Logits
    Token            Logit
    "eters"          -0.77
    "ĸļ"             -0.76
    "oths"           -0.73
    "ptin"           -0.72
    "ategory"        -0.71
    "otta"           -0.70
    "opers"          -0.70
    "pper"           -0.68
    " Sturgeon"      -0.67
    "kson"           -0.67

    Positive Logits
    Token            Logit
    " intentions"     1.14
    " Samar"          1.11
    " deeds"          1.07
    "reads"           1.07
    " deed"           1.06
    "enough"          1.01
    " luck"           1.00
    "luck"            0.97
    "bye"             0.94
    " ol"             0.93

    Activation Density: 0.074%

    No Known Activations