Explanations
terms that convey accusations or claims related to criminal activity
Head Attr Weights
Head  Weight
0     0.06
1     0.12
2     0.08
3     0.04
4     0.04
5     0.05
6     0.08
7     0.03
8     0.05
9     0.03
10    0.31
11    0.05
Negative Logits
Token               Logit
Enh                 -2.25
Function            -2.23
ALD                 -2.22
isSpecialOrderable  -2.10
tuned               -2.08
IGN                 -2.07
NF                  -2.05
Dynamic             -2.03
Enabled             -2.02
Functions           -2.02
Positive Logits
Token               Logit
Rape                4.07
rape                3.87
rape                3.82
rapist              3.57
rapes               3.46
raped               3.16
raping              3.10
rapists             3.00
raped               2.84
Rap                 2.25
Activations Density 0.000%