INDEX
    Explanations

    names, particularly those that start with "Sh" or are phonetically similar

    Negative Logits
    avel    -0.16
    ÌĨ      -0.15
    tems    -0.15
    eming   -0.15
    CID     -0.15
    APS     -0.14
    ondo    -0.14
    hte     -0.14
    êµIJ    -0.14
    jos     -0.14
    Positive Logits
    optimize  0.17
    arda      0.16
     Harden   0.16
    ahn       0.15
    rik       0.15
     заклад   0.14
    OA        0.14
    ool       0.14
    obia      0.14
    oa        0.14
    Act Density 0.019%

    No Known Activations