INDEX
    Explanations

    proper nouns, particularly names of individuals and places

    Negative Logits
    Token       Logit
    iggins      -0.16
    oice        -0.15
     Abs        -0.14
    inand       -0.14
    culus       -0.14
    UBE         -0.14
    clair       -0.14
     Zucker     -0.13
    »¿          -0.13
    335         -0.13
    Positive Logits
    Token       Logit
    orz         0.18
    exus        0.16
    oggles      0.16
    657         0.16
     Rune       0.16
    inton       0.15
     Birliği    0.15
    fold        0.15
    .Dial       0.15
    ussen       0.14
    Act Density 0.189%

    No Known Activations