    Explanations

    Offensive language

    Colloquial, informal, or slangy language (including expletives) used in a conversational tone.

    Negative Logits
    "requires"         -0.07
    "isAdmin"          -0.07
    " فارس"            -0.07
    " neden"           -0.07
    " kod"             -0.07
    " Indeed"          -0.06
    "organizations"    -0.06
    " FC"              -0.06
    "öt"               -0.06
    "γκο"              -0.06
    Positive Logits
    " shit"            0.07
    "_stuff"           0.07
    " stuff"           0.07
    " cops"            0.07
    "距离"             0.06
    ".btnClose"        0.06
    " bananas"         0.06
    "(predicate"       0.06
    "eview"            0.06
    " maliyet"         0.06
    Act Density 0.317%

    No Known Activations