    Explanations

    references to specific groups of people or demographic categories

    Negative Logits
    Token            Logit
    "aneously"       -0.76
    "stals"          -0.71
    "ctors"          -0.66
    "izens"          -0.64
    "unts"           -0.64
    "ints"           -0.63
    " .........."    -0.63
    "stein"          -0.63
    "neys"           -0.62
    " Galile"        -0.60
    Positive Logits
    Token            Logit
    "heet"           1.37
    "hip"            1.21
    "hare"           1.16
    "mith"           1.16
    "cape"           1.07
    "cale"           1.06
    "ilver"          1.04
    "pring"          1.04
    "hift"           1.03
    "pace"           1.01
    Activation Density: 0.119%

    No Known Activations