INDEX
Explanations
derogatory comments or insults directed towards individuals
derogatory terms and insults directed at individuals or groups
Negative Logits
ItemImage             -0.89
isSpecialOrderable    -0.81
displayText           -0.78
Flavoring             -0.77
tracks                -0.75
Cover                 -0.71
Enlarge               -0.71
Newsletter            -0.71
qua                   -0.70
also                  -0.70
POSITIVE LOGITS
asshole               1.51
idiots                1.50
idiot                 1.43
bastard               1.38
bitch                 1.33
cunt                  1.33
fuck                  1.32
retard                1.27
bullshit              1.26
jerk                  1.24
Activations Density: 0.380%