INDEX
    Explanations

    negative descriptions of behaviors and attitudes

    New Auto-Interp
    Negative Logits
    ebi      -0.18
    517      -0.16
    ofire    -0.14
    ÑĦÑĢа    -0.14
    urre     -0.14
    asant    -0.13
    æĭľ      -0.13
    aylor    -0.13
    imore    -0.13
    anka     -0.13
    Positive Logits
     empathy       0.44
     compassion    0.43
     sympathy      0.43
    compass        0.41
     pity          0.40
     Compass       0.39
     empath        0.37
     sympath       0.36
     Emp           0.35
    sy             0.35
    Activation Density: 0.136%

    No Known Activations