    Explanations

    mentions of racism and controversy surrounding public figures

    Negative Logits
    Token            Logit
    "iek"            -0.14
    " increment"     -0.14
    "AGMENT"         -0.14
    "958"            -0.14
    " increments"    -0.14
    "ä»ĭ"            -0.14
    "enberg"         -0.13
    "croft"          -0.13
    "adesh"          -0.13
    "veillance"      -0.13
    Positive Logits
    Token            Logit
    " insensitive"   0.38
    "Insensitive"    0.29
    " offensive"     0.28
    " Offensive"     0.25
    " remarks"       0.24
    "ensitive"       0.23
    " comments"      0.23
    " racially"      0.22
    " sensitive"     0.22
    "remarks"        0.21
    Activation Density: 0.060%

    No Known Activations