INDEX
    Explanations

    references to inclusivity or collective terms

    Negative Logits
    Token        Logit
    "hran"       -0.87
    "yip"        -0.72
    "culosis"    -0.69
    " Gw"        -0.63
    "IND"        -0.62
    " KH"        -0.61
    "artz"       -0.61
    "hz"         -0.60
    "isu"        -0.60
    "Nap"        -0.59
    Positive Logits
    Token            Logit
    "usions"         1.00
    "iances"         0.98
    "uding"          0.96
    " attendant"     0.94
    "udes"           0.93
    " kinds"         0.91
    " associated"    0.84
    " alike"         0.84
    "iance"          0.83
    "ocating"        0.83
    Act Density 0.051%

    No Known Activations