INDEX
    Explanations

    URLs or website links in the document

    Negative Logits
    plex        -0.15
    lier        -0.15
    ÅĻeb        -0.15
    erk         -0.15
    tle         -0.14
     report     -0.14
    yb          -0.14
    ãĤ¹ãĤ¯      -0.14
    ered        -0.14
    edly        -0.14

    Positive Logits
    -content     0.47
    /wp          0.39
    content      0.33
     content     0.28
    Content      0.28
    _content     0.27
     Content     0.27
    .wp          0.26
    _CONTENT     0.26
    ontent       0.26
    Activation Density: 0.008%

    No Known Activations