INDEX
    Explanations

    references to attacks or threats of violence

    Negative Logits
    Token                 Logit
    " للاسماء"            -0.52
    "JspWriter"           -0.48
    "setPointSize"        -0.46
    " prome"              -0.46
    "GTCX"                -0.46
    "UniformLocation"     -0.44
    "Décès"               -0.43
    "FunctionFlags"       -0.42
    " Grund"              -0.42
    " beig"               -0.42
    Positive Logits
    Token                 Logit
    " attack"             1.25
    " Attack"             1.16
    "attack"              1.14
    " attacks"            1.10
    " Attacks"            1.08
    "Attack"              1.04
    "Attacks"             1.00
    "attacks"             0.99
    " attacked"           0.98
    " attacking"          0.97
    Activation Density: 0.107%

    No Known Activations