    Explanations

    expressions and sentiments related to empathy and personal concern

    Negative Logits
     themselves
    -0.19
    allee
    -0.16
    idual
    -0.16
    irie
    -0.16
    igma
    -0.16
    umes
    -0.15
    egas
    -0.15
     nám
    -0.15
    mey
    -0.14
    oru
    -0.14
    Positive Logits
     myself
    0.40
     personally
    0.24
     буду
    0.18
    zz
    0.16
     jsem
    0.15
    zzo
    0.15
    erior
    0.14
     хочу
    0.14
     numberWith
    0.14
    *)((
    0.14
    Activation Density 0.515%

    No Known Activations