INDEX
    Explanations

    proper nouns, particularly names and surnames

    Negative Logits
    Token     Logit
    opt       -0.14
    BILE      -0.14
    .tm       -0.14
    каз       -0.14
    çłģ       -0.14
    aker      -0.13
     ÙħعÙĦ   -0.13
    bak       -0.13
    stra      -0.13
    flater    -0.13
    Positive Logits
    Token     Logit
    alim      0.17
    atz       0.15
    avez      0.15
    cta       0.15
    illaume   0.14
    wright    0.14
    廳        0.14
    bih       0.13
    thin      0.13
    KEEP      0.13
    Activation Density: 0.519%

    No Known Activations