Explanations
explicit and aggressive language directed at individuals
"you" "idiot" "kill" "senseless"
Negative Logits
AnimationsModule  -0.64
وتسجيلات  -0.57
WebElementEntity  -0.51
posedge  -0.51
XmlAccessType  -0.50
новниш  -0.50
referenties  -0.48
Vergrößern  -0.48
CommonModule  -0.47
Obrador  -0.47
Positive Logits
Bet  0.36
Secret  0.35
"\"  0.35
bet  0.35
اح  0.34
0.34
đồ  0.34
desear  0.33
secret  0.33
Pis  0.32
Activations Density 0.030%