INDEX
    Explanations

    references to gender- and race-related discrimination

    Negative Logits
     chỉnh  -0.49
    tralight  -0.46
     sánh  -0.45
    IUrlHelper  -0.45
    kary  -0.43
    Sizes  -0.43
    σμο  -0.42
     arvio  -0.41
    etap  -0.41
    aj  -0.41
    Positive Logits
     незавершена  0.83
     hObject  0.83
    ništ  0.76
    RetentionPolicy  0.73
     utafitiHapana  0.73
     CreateTagHelper  0.73
    extAlignment  0.73
    بوابة  0.70
    IBOutlet  0.70
     being  0.69
    Act Density 0.657%

    No Known Activations