INDEX
    Explanations

    negation or exclusivity in statements

    Head Attr Weights
    0:0.09
    1:0.07
    2:0.08
    3:0.09
    4:0.08
    5:0.07
    6:0.07
    7:0.08
    8:0.09
    9:0.07
    10:0.08
    11:0.07
    Negative Logits
    " Reincarn"    -3.10
    " Canaver"     -2.60
    " deceive"     -2.54
    "endi"         -2.50
    " contam"      -2.49
    " prost"       -2.45
    " actresses"   -2.44
    " hypoc"       -2.36
    "rans"         -2.34
    " inaccur"     -2.31
    Positive Logits
    "AZ"        2.77
    "Berry"     2.76
    "Haw"       2.63
    "Nik"       2.62
    "QL"        2.59
    "Jam"       2.57
    "PF"        2.54
    "ql"        2.54
    "amaz"      2.54
    " Dwell"    2.54
    Act Density 0.000%

    No Known Activations