INDEX
    Explanations

    proper nouns, particularly names of people and places

    NEGATIVE LOGITS
    POCH        -0.17
    ynet        -0.17
    raya        -0.17
    forman      -0.16
    leta        -0.15
    Ïĩν         -0.15
    itele       -0.15
    lds         -0.15
    .nano       -0.14
    ä¸ĢåĮº      -0.14
    POSITIVE LOGITS
    uren        0.18
    lingen      0.17
    elen        0.17
     Fernandez  0.16
    ij          0.16
    eren        0.16
     Bur        0.15
    eden        0.15
    ellen       0.15
     Eden       0.15
    Act Density 0.012%

    No Known Activations