INDEX
    Explanations

    terms related to self-centeredness and self-promotion

    Negative Logits
    Token       Logit
    ungs        -0.15
    oux         -0.15
    KS          -0.15
     è¢         -0.14
    apa         -0.14
    ella        -0.14
    vos         -0.14
    lus         -0.14
    ks          -0.14
     intim      -0.14
    Positive Logits
    Token       Logit
     self       0.20
    /self       0.19
     Self       0.17
    nish        0.17
    same        0.16
    -right      0.16
    stown       0.16
    (self       0.15
     congrat    0.15
     righteous  0.15
    Activation Density: 0.014%

    No Known Activations