INDEX
    Explanations

    advertisements in a document

    indicators of advertisement or promotional content

    Negative Logits
    ĪĴ            -0.76
    opian         -0.72
    rency         -0.72
    ħĭ            -0.67
    acea          -0.66
     ashes        -0.63
     fruitful     -0.63
    uality        -0.62
    amiya         -0.62
    rite          -0.62
    Positive Logits
    ][/               0.76
     WATCHED          0.74
     sidx             0.68
    eh                0.66
     Jac              0.66
    ]"                0.66
    advertisement     0.64
    inately           0.63
     Abrams           0.62
     IMAGES           0.61
    Activation Density: 0.071%

    No Known Activations