INDEX
    Explanations

    terms related to specific types of human identification, such as race and gender

    suffixes indicating specific characteristics or attributes

    Negative Logits
        Token           Logit
        "ĸļ"            -0.82
        "perty"         -0.77
        "otation"       -0.70
        "owitz"         -0.69
        "HAEL"          -0.66
        "arcity"        -0.65
        "hitting"       -0.63
        " showc"        -0.62
        "earch"         -0.62
        "senal"         -0.62
    Positive Logits
        Token           Logit
        "kees"          0.67
        " gentleman"    0.65
        "bush"          0.63
        "hood"          0.62
        "ALLY"          0.61
        "Gate"          0.61
        "lihood"        0.60
        "baum"          0.58
        " Protestant"   0.58
        " boy"          0.58
    Activation Density: 0.191%

    No Known Activations