    Explanations

    significant nouns, particularly those related to people, places, and entities

    Negative Logits
    Token        Logit
    "ghan"       -0.17
    "aves"       -0.17
    "overn"      -0.16
    "gh"         -0.15
    "(er"        -0.15
    "agan"       -0.14
    " Shore"     -0.14
    " latter"    -0.14
    "erm"        -0.14
    "atten"      -0.14

    Positive Logits
    Token        Logit
    "ctest"      0.19
    "vanished"   0.15
    "ÅĻez"       0.14
    "ebek"       0.14
    "((-"        0.14
    "ivant"      0.14
    "abbo"       0.14
    "imli"       0.14
    "etur"       0.13
    " Hamp"      0.13
    Activation Density: 0.076%

    No Known Activations