INDEX
    Explanations

    values or qualities such as kindness, fear, love, and cooperation

    concepts related to morality and societal values

    Negative Logits
    "senal"       -0.67
    "ERSON"       -0.65
    " CFR"        -0.64
    " pestic"     -0.63
    " referen"    -0.63
    "everal"      -0.59
    "amins"       -0.58
    "ahon"        -0.58
    " ancest"     -0.58
    " pocket"     -0.57
    Positive Logits
    "fulness"     0.95
    "lessness"    0.87
    "cellence"    0.84
    "ism"         0.83
    " itself"     0.74
    " equals"     0.73
    " alone"      0.73
    "less"        0.71
    " beg"        0.71
    "ful"         0.70
    Activation Density: 0.413%

    No Known Activations