INDEX
    Explanations

    news article metadata such as publication dates and titles

    instances of the word "First"

    Negative Logits
    " Wer"            -0.64
    "mble"            -0.62
    "steen"           -0.62
    " Canaver"        -0.62
    "termination"     -0.61
    "edom"            -0.61
    " Genie"          -0.59
    " avoidance"      -0.58
    "geries"          -0.58
    "nery"            -0.58
    Positive Logits
    " Published"      0.90
    " Posts"          0.77
    " Posted"         0.75
    " posted"         0.74
    " archived"       0.74
    " Upload"         0.68
    "Posted"          0.66
    "post"            0.64
    "published"       0.64
    " Download"       0.63
    Act Density: 0.027%

    No Known Activations