INDEX
    Explanations

    explicit and aggressive language directed at individuals

    "you" "idiot" "kill" "senseless"

    Negative Logits
    AnimationsModule
    -0.64
     وتسجيلات
    -0.57
    WebElementEntity
    -0.51
    posedge
    -0.51
    XmlAccessType
    -0.50
    новниш
    -0.50
     referenties
    -0.48
     Vergrößern
    -0.48
    CommonModule
    -0.47
     Obrador
    -0.47
    Positive Logits
     Bet
    0.36
     Secret
    0.35
     "\"
    0.35
     bet
    0.35
     اح
    0.34
    󠁴
    0.34
     đồ
    0.34
     desear
    0.33
     secret
    0.33
     Pis
    0.32
    Activation Density 0.030%

    No Known Activations