INDEX
    Explanations

    concepts related to self-perception and social awareness

    Negative Logits
    ylko                 -0.15
    iyim                 -0.15
    ouse                 -0.14
     NSStringFromClass   -0.14
     Kidd                -0.13
    ral                  -0.13
    iddi                 -0.13
    appy                 -0.13
    weg                  -0.12
     Jewel               -0.12
    Positive Logits
    nings                0.17
    nick                 0.15
    .simps               0.14
    razier               0.14
     æķħ                 0.14
    à¥įमà¤ķ              0.14
    zego                 0.13
    olas                 0.13
    ATUS                 0.13
    oval                 0.13
    Act Density 0.455%

    No Known Activations