    Explanations

    proper nouns and specific names, particularly those related to locations

    Negative Logits
    ooks            -0.17
    ior             -0.16
    _INITIALIZER    -0.15
     subt           -0.15
    igit            -0.15
    /gen            -0.14
    avers           -0.14
    à¹ĥà¸Ķ          -0.14
    dw              -0.14
    è¡Į             -0.14
    Positive Logits
    ixer            0.20
    ix              0.19
    yne             0.18
    yme             0.18
    inear           0.17
    aub             0.15
    ë§Ŀ             0.15
    ussy            0.15
    .forName        0.15
    yro             0.15
    Act Density 0.096%

    No Known Activations