    Explanations

    expressions related to offensiveness and inappropriate language

    Negative Logits
    __':               -0.84
    saraba             -0.77
    parsedMessage      -0.76
     EconPapers        -0.73
     WebDriverWait     -0.72
     تضيفلها           -0.68
    ]")]               -0.68
    tagHelperRunner    -0.66
     Савезне           -0.65
    Искәрмәләр         -0.65
    Positive Logits
     offended      1.60
     offend        1.47
     offense       1.38
     offence       1.37
     offending     1.35
     offensive     1.23
     Offense       1.14
    offensive      1.13
     Offensive     1.07
     affront       1.07
    Activation Density: 0.426%

    No Known Activations