INDEX
    Explanations

    informational cues such as section headings and advertisements within a text

    repeated references to "story" and "advertisement" in the text

    Negative Logits
    Token          Logit
    "cele"         -0.73
    " Amit"        -0.63
    " Imper"       -0.62
    " monog"       -0.62
    " cel"         -0.61
    " unrecogn"    -0.60
    "ste"          -0.60
    " pri"         -0.59
    "authent"      -0.58
    "phen"         -0.58

    Positive Logits
    Token          Logit
    "iculty"       0.66
    "espie"        0.66
    "etary"        0.65
    " Extras"      0.65
    " VIDEOS"      0.64
    "acters"       0.63
    "yright"       0.63
    "miah"         0.63
    "ERY"          0.62
    "iola"         0.62

    Act Density 0.109%

    No Known Activations