INDEX
    Explanations

    references to or mentions of men and related terms

    references to men and their presence in various contexts

    Negative Logits
    Token         Logit
    IVERS         -0.84
    VICE          -0.73
    REDACTED      -0.69
    Closure       -0.65
    Main          -0.64
    Berry         -0.63
    worthiness    -0.63
    ★★            -0.61
    KEN           -0.60
    DATA          -0.59

    Positive Logits
    Token         Logit
    opausal       1.45
    endez         1.29
    volent        1.27
    aced          1.19
    orah          1.19
    ager          1.19
    aces          1.17
    stru          1.04
    folk          1.00
    uscript       0.99
    Activation Density: 0.080%

    No Known Activations