INDEX
    Explanations

    references to literary works or books in various contexts

    references to religious texts and their titles

    Negative Logits
    Token           Logit
    "ilitary"       -0.94
    "asio"          -0.81
    "orescence"     -0.79
    " sclerosis"    -0.70
    " corrosion"    -0.69
    "ilty"          -0.69
    "00200000"      -0.64
    " democracy"    -0.62
    "adow"          -0.62
    "undai"         -0.61
    Positive Logits
    Token           Logit
    "stores"        1.24
    "marks"         1.18
    "Book"          1.13
    "book"          1.10
    "mark"          1.09
    "she"           1.05
    "store"         1.02
    " Book"         1.02
    "worm"          1.01
    "seller"        0.96
    Activation Density: 0.019%

    No Known Activations