INDEX
    Explanations

    references to social connections and interpersonal relationships

    Negative Logits
     è©ķ価  -0.14
    .Undef  -0.14
    aktu  -0.14
    .bc  -0.14
    adia  -0.13
    ibox  -0.13
    oÅĽci  -0.13
    ฤ  -0.13
    acs  -0.13
     itself  -0.13
    Positive Logits
     either  0.22
     Either  0.19
    Either  0.18
     all  0.18
    either  0.17
     EITHER  0.17
    们  0.17
    zik  0.16
    æ£Ĵ  0.15
     often  0.15
    Act Density 0.227%

    No Known Activations