INDEX
    Explanations

    references to demographics and social groups

    Negative Logits
    +#+#               -0.64
    ,:),               -0.61
    帖最后由           -0.60
    complexContent     -0.60
    ]=>                -0.57
     Normdatei         -0.54
    addContainerGap    -0.53
     gdyż              -0.53
    hoeddwyd           -0.52
    Hauptartikel       -0.51
    Positive Logits
     who           0.92
     with          0.83
     across        0.82
     everywhere    0.79
     whose         0.71
     in            0.70
     around        0.69
     that          0.67
     from          0.66
     without       0.60
    Activation Density 0.424%

    No Known Activations