INDEX
    Head Attr Weights
    0:0.07
    1:0.08
    2:0.07
    3:0.08
    4:0.09
    5:0.08
    6:0.09
    7:0.08
    8:0.09
    9:0.08
    10:0.07
    11:0.07
    Negative Logits
    " transsexual"     -2.88
    "mot"              -2.79
    " Salary"          -2.62
    " decriminal"      -2.58
    " prostitution"    -2.57
    " surname"         -2.55
    "wives"            -2.52
    " Wanted"          -2.51
    "selling"          -2.50
    " Negro"           -2.49
    Positive Logits
    " Kirin"           2.91
    (token missing)    2.82
    "aceous"           2.65
    "keyes"            2.40
    "INO"              2.32
    (token missing)    2.32
    " exquisite"       2.30
    " Tiff"            2.25
    "�士"              2.24
    "623"              2.20
    Act Density 0.000%

    No Known Activations