INDEX
    Explanations

    mentions of Muslims and related topics, including various forms of identity and community presence

    Negative Logits
    Ĥ¹              -0.16
    便              -0.15
    esk             -0.15
    ffffffff        -0.15
    .ua             -0.14
     Sparks         -0.14
    edere           -0.13
    atern           -0.13
    "crypto         -0.13
    AZY             -0.13
    Positive Logits
    sWith           0.14
     addslashes     0.14
     faith          0.14
    meli            0.14
    orio            0.14
    sla             0.14
    -Owned          0.13
    utton           0.13
    ves             0.13
    aph             0.13
    Act Density 0.011%

    No Known Activations