INDEX
    Explanations

    phrases related to emotional responses and interpersonal relationships

    Negative Logits
    Token          Logit
    "à¹Ģà¸Ńà¸ĩ"     -0.17
    "amins"        -0.17
    " sebou"       -0.17
    "entials"      -0.17
    " him"         -0.16
    " Humans"      -0.15
    " mình"        -0.15
    "éº"           -0.14
    "à¸Ńà¸ļ"       -0.14
    "gregar"       -0.14
    Positive Logits
    Token          Logit
    " their"       0.43
    " THEIR"       0.36
    " peoples"     0.33
    "their"        0.32
    " everyone"    0.31
    " Their"       0.31
    "Their"        0.30
    " deren"       0.30
    " theirs"      0.30
    " everybody"   0.29
    Activation Density: 0.547%

    No Known Activations