INDEX
    Explanations

    phrases indicating desire, expectations, and social relationships

    Negative Logits
    Token             Logit
    "azes"            -0.14
    "relay"           -0.14
    " sami"           -0.14
    "ì¿"              -0.14
    "antics"          -0.14
    "rell"            -0.14
    "anson"           -0.14
    "elastic"         -0.13
    "ongan"           -0.13
    "rap"             -0.13

    Positive Logits
    Token             Logit
    " happiness"      0.19
    " succeed"        0.17
    " receive"        0.17
    "receive"         0.16
    " success"        0.16
    ".idea"           0.16
    " welfare"        0.16
    " receives"       0.15
    " satisfaction"   0.15
    "perience"        0.15

    Activation Density: 0.205%

    No Known Activations