INDEX
    Explanations

    phrases related to personal relationships and emotional dependencies

    Negative Logits
     themselves
    -0.42
     itself
    -0.35
     kita
    -0.17
    ald
    -0.16
     خودش
    -0.16
    ěl
    -0.15
    esin
    -0.15
    alm
    -0.15
     himself
    -0.15
    os
    -0.15
    Positive Logits
     yourself
    0.82
     Yourself
    0.55
     yourselves
    0.54
     your
    0.46
    your
    0.43
    你的
    0.39
    Your
    0.30
     ваш
    0.29
     можете
    0.28
     votre
    0.27
    Activation Density 1.150%

    No Known Activations