INDEX
    Explanations

    possessive forms indicating ownership or association

    Negative Logits
    小说
    -0.21
    情
    -0.20
    人員
    -0.19
    人気
    -0.18
    大
    -0.18
    心
    -0.18
    人
    -0.17
    ’s
    -0.17
    人类
    -0.17
    人员
    -0.17
    Positive Logits
     own
    0.26
     Own
    0.21
    ÂĢÂĻ
    0.21
     gotta
    0.19
    ÂĿ
    0.19
    own
    0.18
    '
    0.18
    -eye
    0.18
     sake
    0.17
     been
    0.17
    Act Density 0.709%

    No Known Activations