INDEX
    Explanations

    instances of posts and submissions attributed to users

    Negative Logits
    Token           Logit
    "arer"          -0.18
    "ilder"         -0.16
    "erif"          -0.16
    "773"           -0.15
    "pedia"         -0.15
    "Traits"        -0.15
    "ÅĻ"            -0.15
    " pÃŃs"         -0.14
    "asu"           -0.14
    "_Construct"    -0.13

    Positive Logits
    Token           Logit
    " Score"        0.15
    " Ven"          0.15
    " ven"          0.14
    "otope"         0.14
    " Stern"        0.14
    " onto"         0.14
    "-score"        0.14
    "odesk"         0.14
    " Lâm"          0.13
    "score"         0.13
    Activation Density: 0.010%

    No Known Activations