INDEX
    Explanations

    mentions of authors and contributors to literary works

    Negative Logits
    Token        Logit
    "hani"       -0.15
    " rep"       -0.15
    " outcome"   -0.15
    "asso"       -0.15
    "ij"         -0.15
    " crest"     -0.14
    "asco"       -0.14
    "ADDING"     -0.14
    "bsp"        -0.14
    "onna"       -0.14
    Positive Logits
    Token        Logit
    ".OS"        0.15
    " gazet"     0.14
    "sse"        0.14
    "inÄĽ"       0.14
    " اÙĨت"     0.14
    "wik"        0.14
    "NF"         0.14
    "ORB"        0.14
    " NF"        0.14
    "atern"      0.13
    Act Density 0.002%

    No Known Activations