INDEX
    Explanations

    specific URLs within text

    instances of the word "Link" as a reference to a hyperlink or connection in the text

    Negative Logits
    "PDATE"         -0.87
    "nces"          -0.85
    "ãĥ£"           -0.71
    " proble"       -0.70
    " conflic"      -0.66
    "teenth"        -0.64
    " reproduce"    -0.64
    "pty"           -0.64
    "ktop"          -0.63
    "ORK"           -0.63
    Positive Logits
    "edin"          1.52
    "later"         1.39
    "witz"          1.11
    "ering"         0.97
    "ed"            0.96
    "age"           0.93
    "ages"          0.92
    "edIn"          0.90
    "er"            0.89
    "ery"           0.86
    Activation Density: 0.036%

    No Known Activations