    Explanations

    terms associated with self-righteousness and superiority in moral contexts

    Negative Logits
     Lag      -0.17
    lore      -0.16
    anim      -0.15
    lea       -0.14
    ussen     -0.14
    ay        -0.14
    chat      -0.14
    ta        -0.13
    aan       -0.13
     "        -0.13
    Positive Logits
    ismu      0.16
    alf       0.16
    ToOne     0.16
    ism       0.16
    aggio     0.15
     Vul      0.15
    ulance    0.15
     Foto     0.15
    edn       0.15
    +xml      0.14
    Activation Density: 0.116%

    No Known Activations