INDEX
    Explanations

    phrases related to controversial or potentially harmful topics, such as racially motivated attacks, compromised identity keys, health concerns linked to pesticides, and claims of false statements

    references to social and legal issues, particularly those involving crime, politics, and public sentiment

    Negative Logits
     Darling      -0.52
    Jr            -0.52
     concess      -0.49
    mbuds         -0.47
     Kop          -0.47
     overe        -0.46
    retty         -0.46
    sit           -0.46
    educ          -0.46
    advert        -0.46
    Positive Logits
    )?            0.94
    ¶             0.84
     Belfast      0.79
     ):           0.76
     constitutes  0.75
     violates     0.75
    .--           0.75
    .–            0.72
     Copyright    0.72
    "?            0.72
    Activation Density: 1.375%

    No Known Activations