INDEX
    Explanations

    references to personal relationships and social dynamics

    Negative Logits
    Token        Logit
    "ymoon"      -0.17
    "nev"        -0.15
    "šak"        -0.15
    "мена"       -0.15
    "edback"     -0.14
    "Escort"     -0.14
    "tvrt"       -0.14
    "utsche"     -0.14
    "quet"       -0.14
    " bì"        -0.14
    Positive Logits
    Token        Logit
    "174"        0.18
    "239"        0.15
    "233"        0.15
    "ering"      0.14
    " Lib"       0.14
    "agen"       0.14
    "çĬ¶"        0.14
    "ht"         0.14
    "Perm"       0.14
    "ahr"        0.13
    Activation Density: 0.077%

    No Known Activations