INDEX
    Explanations

    inappropriate language or profanity

    instances of profanity and vulgar language

    Negative Logits
    Specific               -0.79
    ItemImage              -0.78
    wcs                    -0.72
     è£ıè                  -0.70
    inventoryQuantity      -0.70
     condem                -0.69
    åĬ                     -0.68
    ItemThumbnailImage     -0.68
     narrowing             -0.67
     complication          -0.67
    Positive Logits
    ruary           0.96
    gerald          0.82
    gling           0.80
    ratulations     0.77
    illet           0.77
    ibly            0.75
     laugh          0.72
    sth             0.72
    him             0.72
    estro           0.72
    Act Density 0.039%

    No Known Activations