    Explanations

    Israel and Jewish people

    Negative Logits
    marine       -0.08
    “We          -0.08
    .↵           -0.07
                 -0.07
    Germany      -0.07
                 -0.06
    stuff        -0.06
    .onCreate    -0.06
    Safe         -0.06
     ku          -0.06
    Positive Logits
    xEB          0.07
     мак         0.07
    _EXEC        0.06
     minY        0.06
     lesbisk     0.06
    porn         0.06
    _filtered    0.06
    _ACT         0.06
     BUF         0.06
     fotoğraf    0.06
    Activation Density: 0.014%

    No Known Activations