    Explanations

    concepts related to altruism and selflessness in human behavior

    Negative Logits
    " chez"      -0.18
    " внутри"    -0.15
    "inside"     -0.15
    "لط"         -0.14
    "Within"     -0.13
    "within"     -0.13
    "ufe"        -0.13
    "Inside"     -0.13
    " Inch"      -0.13
    "нить"       -0.13

    Positive Logits
    " in"        0.68
    " σε"        0.32
    "在"         0.29
    " في"        0.29
    " în"        0.27
    "ใน"         0.26
    " 在"        0.25
    " ใน"        0.25
    " در"        0.25
    "\u00a0in"   0.24
    Activation Density: 0.815%

    No Known Activations