INDEX
    Explanations

    mentions of the word "anti" followed by another word

    references to anti-Semitic sentiments or themes

    Negative Logits
    Token           Logit
    "mable"         -0.81
    "ccording"      -0.77
    " tremend"      -0.76
    "manship"       -0.72
    "doms"          -0.72
    "matically"     -0.71
    " corrid"       -0.67
    " rall"         -0.65
    " skelet"       -0.65
    " rul"          -0.64

    Positive Logits
    Token           Logit
    "anti"          1.09
    "ucci"          0.92
    " Devi"         0.86
    "iso"           0.86
    "opsis"         0.85
    "zona"          0.84
    "qua"           0.82
    "pora"          0.78
    "oco"           0.78
    "ctr"           0.77
    Activation Density: 0.013%

    No Known Activations