INDEX
    Explanations

    references to Jewish cultural or religious identities and communities

    Negative Logits
    Token       Logit
    ãĥ³ãĥĦ      -0.16
     Slee       -0.15
    erc         -0.15
    å·»         -0.15
    anje        -0.14
    iene        -0.14
    stell       -0.14
    iat         -0.14
     Ramadan    -0.14
    emarks      -0.14

    Positive Logits
    Token       Logit
    تÙĦ         0.17
    TL          0.14
     Michaels   0.14
    帯          0.14
    帶          0.14
     Dro        0.14
     dropping   0.13
    oming       0.13
    ama         0.13
     Å          0.13
    Activation Density: 0.008%

    No Known Activations