INDEX
    Explanations

    specific names and positions associated with individuals or entities

    NEGATIVE LOGITS
     himself
    -0.32
     Himself
    -0.24
     his
    -0.20
    his
    -0.18
     seinen
    -0.18
     sám
    -0.17
     seiner
    -0.15
    他的
    -0.14
     его
    -0.14
     seine
    -0.14
    POSITIVE LOGITS
     alike
    0.45
     respectively
    0.42
     respective
    0.30
     각각
    0.28
     themselves
    0.25
    分别
    0.24
     соответ
    0.22
     sowie
    0.21
    两人
    0.21
     serta
    0.20
    Act Density 0.153%

    No Known Activations