    Explanations

    references to personal names and their attributes

    Negative Logits
    " Naming"      -0.25
    " naming"      -0.23
    " titles"      -0.21
    "Naming"       -0.20
    " renaming"    -0.20
    "(names"       -0.19
    "姓"           -0.19
    " nomin"       -0.18
    " nick"        -0.18
    " apellido"    -0.18
    Positive Logits
    "ame"          0.31
    "na"           0.31
    "Na"           0.30
    " Na"          0.30
    " na"          0.30
    "_na"          0.27
    "-na"          0.26
    "name"         0.25
    "AME"          0.24
    " name"        0.24
    Activation Density: 0.103%

    No Known Activations