INDEX
    Explanations

    mentions of authors and their works

    Negative Logits
    ary         -0.20
    iguous      -0.16
    znik        -0.16
    aq          -0.14
    567         -0.14
    aries       -0.14
    ei          -0.14
     McDon      -0.14
    æĹı         -0.14
    ARY         -0.14
    Positive Logits
    itative     0.19
     admin      0.18
     cb         0.16
    etas        0.15
    ama         0.15
    izen        0.15
     CB         0.15
    \"          0.15
    itarian     0.15
    SHIP        0.15
    Activation Density 0.010%

    No Known Activations