INDEX
    Explanations

    various forms of the word "ad" or references to advertisements

    Head Attr Weights
    Head    Weight
    0       0.02
    1       0.02
    2       0.05
    3       0.06
    4       0.06
    5       0.05
    6       0.41
    7       0.05
    8       0.05
    9       0.06
    10      0.07
    11      0.04
    Negative Logits
    Token            Logit
    "leep"           -1.44
    "¯"              -1.42
    "usercontent"    -1.41
    "icked"          -1.31
    " Hallow"        -1.23
    " seniors"       -1.20
    " compuls"       -1.19
    "sed"            -1.16
    "etheless"       -1.15
    " Jolly"         -1.14
    Positive Logits
    Token                 Logit
    "igham"               1.74
    "utical"              1.45
    "IRO"                 1.44
    "ciation"             1.36
    "iscons"              1.35
    "ignt"                1.33
    "emale"               1.30
    [token not shown]     1.30
    "rouse"               1.28
    "combe"               1.27
    Act Density 0.002%

    No Known Activations