INDEX
    Explanations

    references to news outlets or news-related terms

    instances of the word "news" and its variants

    Negative Logits
     equival    -0.69
    ãĤ©         -0.68
    argon       -0.67
    é¾          -0.66
    verages     -0.63
    DEBUG       -0.63
    DoS         -0.63
     IPM        -0.61
    ashtra      -0.61
    pmwiki      -0.58
    Positive Logits
    leans       0.79
    izons       0.74
    angled      0.70
    pires       0.70
    pac         0.65
    lisher      0.65
    phrine      0.65
    enment      0.64
    ilings      0.64
    enhagen     0.63
    Activation Density 0.155%

    No Known Activations