INDEX
    Explanations

    references to controversial historical figures and symbols

    Negative Logits
    "çģ"         -0.14
    "íķ"         -0.14
    "icas"       -0.14
    "hus"        -0.14
    "ighest"     -0.14
    "antium"     -0.13
    "ritis"      -0.13
    "ãĥĨãĥ«"     -0.13
    " RELEASE"   -0.13
    "Ú©ÛĮÙĦ"     -0.13

    Positive Logits
    " statue"        0.34
    " statues"       0.32
    " Confederate"   0.30
    " symbols"       0.29
    " symbol"        0.26
    " Symbols"       0.26
    " conf"          0.26
    " monuments"     0.25
    "symbols"        0.24
    " removal"       0.24
    Activation Density: 0.076%

    No Known Activations