INDEX
    Explanations

    references to Jewish people and related terms that may indicate stereotypes or negative sentiments

    Negative Logits
    Token               Logit
    " InputDecoration"  -0.44
    " paraître"         -0.44
    "itoare"            -0.39
    " AppColors"        -0.38
    "InputBorder"       -0.37
    " szól"             -0.36
    " jScrollPane"      -0.35
    " Unterscheidung"   -0.35
    "rungsseite"        -0.35
    "composición"       -0.35
    Positive Logits
    Token               Logit
    " Jewish"           0.69
    " Jews"             0.69
    " Jew"              0.69
    "Jewish"            0.68
    " Hebrew"           0.64
    "Jews"              0.63
    " jewish"           0.62
    " Judaism"          0.61
    " Tear"             0.60
    "Datuak"            0.60
    Activation Density: 1.899%

    No Known Activations