INDEX
    Explanations

    proper nouns, particularly names of individuals and their associated details

    Negative Logits
    " repl"          -0.15
    " ubiquitous"    -0.14
    " multif"        -0.14
    "adaki"          -0.13
    "aded"           -0.13
    ".vec"           -0.13
    "uzey"           -0.13
    "agan"           -0.13
    "716"            -0.13
    ".rev"           -0.13
    Positive Logits
    "pty"        0.17
    "verter"     0.15
    " appro"     0.14
    "ynet"       0.14
    "phins"      0.14
    "setattr"    0.14
    "тин"        0.13
    "çuk"        0.13
    "heavy"      0.13
    " assim"     0.13
    Act Density 0.056%

    No Known Activations