INDEX
    Explanations

    proper nouns, particularly names and titles

    Negative Logits
     horizont       -0.89
     horr           -0.88
     Thumbnails     -0.79
     hemor          -0.79
     behav          -0.77
     iP             -0.77
     ASC            -0.74
     livest         -0.73
     multic         -0.73
     multip         -0.72

    Positive Logits
    enei            1.22
    owsky           1.22
    ovich           1.13
    iani            1.11
    owitz           1.07
    awi             1.06
    ati             1.03
    cki             1.01
    gger            0.99
    olini           0.99
    Activation Density 0.238%

    No Known Activations