INDEX
    Explanations

    ways to identify and reference specific individuals, especially in contexts related to interactions or creative endeavors

    Negative Logits
     itself
    -0.20
     himself
    -0.16
     nào
    -0.14
     vše
    -0.14
    _FE
    -0.13
     خودش
    -0.13
    оло
    -0.13
    ерж
    -0.13
    unga
    -0.13
     sám
    -0.13
    Positive Logits
     themselves
    0.27
     respectively
    0.26
     alike
    0.22
     together
    0.20
     그리고
    0.20
     respective
    0.19
     각각
    0.18
     their
    0.17
     Their
    0.17
     Together
    0.17
    Act Density 0.152%

    No Known Activations