    Explanations

    names containing "Hen" or variations like "Henrique" or "Henrik"

    Negative Logits
    Token          Logit
    "EED"          -0.79
    "terday"       -0.73
    "henko"        -0.68
    "ATIVE"        -0.64
    "eele"         -0.61
    " SIGN"        -0.60
    "Reference"    -0.59
    "ateral"       -0.59
    "FACE"         -0.59
    "align"        -0.58
    Positive Logits
    Token          Logit
    "rique"        1.23
    "riks"         1.12
    "rik"          1.10
    "ning"         1.05
    "nery"         1.04
    "sel"          0.99
    "ricks"        0.95
    "riot"         0.94
    "etr"          0.94
    "ned"          0.93
    Activation Density: 0.025%

    No Known Activations