INDEX
    Explanations

    proper nouns, particularly names and organizational titles

    Negative Logits
     Jenny      -0.18
    ħ           -0.17
    uyo         -0.16
    atest       -0.16
    ahren       -0.16
    owski       -0.16
     Burnett    -0.15
     Weiss      -0.15
    arez        -0.15
    okino       -0.15
    Positive Logits
    ilion       0.20
    §           0.18
    982         0.17
     Alley      0.17
    agher       0.17
    š           0.17
     Walton     0.17
     Foley      0.16
    thro        0.16
    alion       0.16
    Activation Density: 0.243%

    No Known Activations