    Explanations

    instances of contradiction or doublespeak in statements

    Head Attr Weights
    0:0.05
    1:0.04
    2:0.01
    3:0.09
    4:0.05
    5:0.17
    6:0.04
    7:0.02
    8:0.07
    9:0.38
    10:0.01
    11:0.01
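
A minimal sketch for reading the head attribution weights listed above: it ranks the heads by weight and reports each one's share of the listed total. The variable names are illustrative assumptions, and the values are copied from the list as shown; the listed weights are rounded, so the total need not be exactly 1.

```python
# Head attribution weights copied from the list above (head index -> weight).
head_attr_weights = {
    0: 0.05, 1: 0.04, 2: 0.01, 3: 0.09, 4: 0.05, 5: 0.17,
    6: 0.04, 7: 0.02, 8: 0.07, 9: 0.38, 10: 0.01, 11: 0.01,
}

# Rank heads from largest to smallest weight.
ranked = sorted(head_attr_weights.items(), key=lambda kv: kv[1], reverse=True)

# Sum of the listed (rounded) weights, used to express each weight as a share.
total = sum(head_attr_weights.values())

# Show the three heads with the largest weights.
for head, weight in ranked[:3]:
    print(f"head {head}: {weight:.2f} ({weight / total:.0%} of listed weight)")
```

On these values, head 9 carries the largest weight, followed by heads 5 and 3.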
    Negative Logits
    "apo"        -2.49
    "ieu"        -2.47
    "ema"        -2.10
    "akable"     -2.04
    " Submit"    -2.03
    "depth"      -2.00
    "utor"       -1.95
    "ivities"    -1.94
    "icter"      -1.90
    "rition"     -1.83
    Positive Logits
    " mentions"       2.39
    " ALSO"           1.96
    " conspic"        1.92
    " Phelps"         1.92
    " prominently"    1.91
    " also"           1.91
    ".)"              1.88
    " coincided"      1.86
    " Rowling"        1.86
    " fared"          1.84
    Activation Density: 0.129%

    No Known Activations