INDEX
    Explanations

    phrases related to reflection and self-awareness

    Negative Logits
    " conmigo"       -1.02
    " comigo"        -0.96
    " me"            -0.90
    " Itself"        -0.88
    " asegurarse"    -0.82
    "私に"           -0.79
    " dirinya"       -0.78
    " magát"         -0.78
    "itself"         -0.78
    "私の"           -0.75
    Positive Logits
    " ourselves"     2.71
    " our"           1.18
    " можем"         1.13
    " знаем"         1.02
    " будем"         0.96
    " نحن"           0.92
    "our"            0.92
    " هستیم"         0.92
    " видим"         0.91
    " we"            0.89
    Activation Density: 0.459%

    No Known Activations