    Explanations

    names of individuals

    proper nouns, particularly names

    Negative Logits
    Token         Logit
    龍契士         -0.90
    sburgh        -0.78
    覚醒           -0.67
    BILITY        -0.66
    作             -0.64
     butterfly    -0.64
    BILITIES      -0.63
    ENCE          -0.62
    ments         -0.62
    ITED          -0.61
    Positive Logits
    Token         Logit
    igans         1.08
    ghai          1.08
    amar          1.06
    kees          1.04
    amo           1.00
    seys          0.99
    agos          0.98
    atta          0.96
    amia          0.95
    allo          0.93
    Activation Density: 0.033%

    No Known Activations