INDEX
    Explanations

    articles or sections in a document that are followed by advertisements

    instances of advertisements or promotional content

    Negative Logits
    veland          -0.70
     accus          -0.69
    anus            -0.66
     abusing        -0.66
     homebrew       -0.63
     aur            -0.62
    ynchronous      -0.61
     Emin           -0.61
    xual            -0.60
    mbuds           -0.59
    Positive Logits
    SPONSORED       0.84
    Advertisement   0.78
    VERTISEMENT     0.77
    Space           0.72
    エル            0.72
    Story           0.71
    Related         0.70
    ILE             0.69
    JUST            0.69
    Layer           0.69
    Act Density 0.033%

    No Known Activations