    Explanations

    Derogatory language and criticism directed at individuals, particularly in discourse about other people.

    NEGATIVE LOGITS
    Token                  Logit
    "IUrlHelper"           -0.60
    " kasarigan"           -0.58
    "adaptiveStyles"       -0.56
    "fromnode"             -0.54
    "matchCondition"       -0.52
    "WebElementEntity"     -0.48
    (blank)                -0.47
    " mania"               -0.46
    (blank)                -0.46
    " MainAxisSize"        -0.45

    POSITIVE LOGITS
    Token                  Logit
    " insulting"            0.61
    " criticisms"           0.59
    " insults"              0.57
    " derogatory"           0.56
    " insult"               0.55
    " criticism"            0.54
    " accusations"          0.51
    " dispar"               0.51
    " mocking"              0.51
    " disrespectful"        0.50

    Activation density: 0.345%

    No Known Activations