INDEX
Explanations
phrases related to acts of disrespect or disdainful behavior
Head Attr Weights
0:0.03
1:0.01
2:0.06
3:0.05
4:0.13
5:0.03
6:0.08
7:0.29
8:0.04
9:0.03
10:0.11
11:0.10
Negative Logits
Moderate              -1.63
quickShipAvailable    -1.63
ORTS                  -1.52
externalActionCode    -1.51
CVE                   -1.51
FLAG                  -1.48
Mathemat              -1.45
soDeliveryDate        -1.42
sym                   -1.39
ItemThumbnailImage    -1.38
Positive Logits
grate        1.78
opol         1.59
vengeance    1.52
Brus         1.52
beet         1.48
scraps       1.47
gore         1.45
ilk          1.43
hairs        1.43
BS           1.39
Activations Density 0.002%