    Negative Logits
    Token          Logit
                   -0.07
    " yer"         -0.07
    "Prof"         -0.06
    "badge"        -0.06
    "иля"          -0.06
    " pavement"    -0.06
                   -0.06
    " Colleg"      -0.06
    " gbc"         -0.06
    "IZE"          -0.06
    Positive Logits
    Token          Logit
    " simply"      0.07
    " apresent"    0.07
    "_cmd"         0.07
    " aseg"        0.06
    "Asian"        0.06
    "shopping"     0.06
    " şark"        0.06
    "§"            0.06
    " dating"      0.06
    " ethnicity"   0.06
    Activation Density: 0.008%

    No Known Activations