    Explanations

    references to personal relationships and social dynamics involving individuals and groups

    Negative Logits
    Token             Logit
    " �"              -0.20
    " Â"              -0.19
    "Âĸ"              -0.16
    "ÂĶ"              -0.16
    "ãĤĪãģĨãģ§ãģĻ"    -0.15
    "Âĵ"              -0.15
    "ÂĿ"              -0.14
    " ÃĤ"             -0.14
    "´t"              -0.14
    " ´"              -0.14
    Positive Logits
    Token             Logit
    "'s"              0.82
    "’s"              0.75
    "çļĦ"             0.63
    "ìĿĺ"             0.54
    "ãģ®"             0.45
    "çļĦ大"           0.44
    "‘s"              0.43
    "´s"              0.43
    " çļĦ"            0.43
    "'S"              0.42
    Activation Density 2.081%

    No Known Activations
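
Several of the logit tokens above (for example çļĦ, ìĿĺ, ãģ®) look like the printable stand-ins that GPT-2-style byte-level BPE tokenizers use for raw UTF-8 bytes. The sketch below assumes that is the tokenizer family in use, which the dump itself does not confirm, and rebuilds that byte mapping to recover the underlying text. The names `byte_to_unicode` and `decode_token` are illustrative, not part of any dashboard API. Entries that the dashboard already shows as ordinary text (such as ’s) should not be run through it.

```python
def byte_to_unicode():
    """Rebuild the byte -> printable-character table of GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to code points starting at 256.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


UNICODE_TO_BYTE = {ch: b for b, ch in byte_to_unicode().items()}


def decode_token(token):
    """Map a byte-level token string back to text (assumes GPT-2-style encoding)."""
    raw = b"".join(
        bytes([UNICODE_TO_BYTE[ch]]) if ch in UNICODE_TO_BYTE else ch.encode("utf-8")
        for ch in token
    )
    # Tokens that hold only part of a multi-byte character decode to U+FFFD.
    return raw.decode("utf-8", errors="replace")


for tok in ["çļĦ", "ìĿĺ", "ãģ®"]:
    print(repr(tok), "->", decode_token(tok))
# 'çļĦ' -> 的
# 'ìĿĺ' -> 의
# 'ãģ®' -> の
```

Under that assumption, the strongest positive logits are possessive markers across languages: English 's, Chinese 的, Korean 의, and Japanese の.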