INDEX
    Explanations

    mentions of 'fake news'

    references to "fake news."

    Negative Logits
    ayne          -0.70
     Scher        -0.70
    vasive        -0.70
    asse          -0.68
    inence        -0.66
    arious        -0.64
    inished       -0.64
    xus           -0.63
    urdue         -0.63
     Swe          -0.62
    Positive Logits
    worthy        1.05
    rooms         1.01
    feed          0.96
    room          0.95
    worthiness    0.92
     headlines    0.87
    groups        0.82
    peak          0.82
     coverage     0.81
    reader        0.80
    Act Density 0.040%

    No Known Activations