INDEX
    Explanations

    mentions of the word "man" with varying emphasis

    repeated references to the word "man."

    Negative Logits
    "Import"           -0.69
    "PsyNetMessage"    -0.69
    "IVERS"            -0.64
    " Prompt"          -0.63
    "Recommend"        -0.60
    "Girls"            -0.60
    "PT"               -0.60
    "Integ"            -0.60
    " Supplement"      -0.59
    "Democratic"       -0.59
    Positive Logits
    "hunt"             1.54
    "nered"            1.35
    "uscript"          1.29
    "hood"             1.27
    "gling"            1.26
    "hattan"           1.15
    "osphere"          1.12
    "abase"            1.10
    "orah"             1.02
    "ila"              1.01
    Act Density 0.067%

    No Known Activations