INDEX
    Explanations

    referential phrases that emphasize various emphatic expressions or opinions

    Head Attr Weights
    0:0.01
    1:0.01
    2:0.05
    3:0.17
    4:0.06
    5:0.01
    6:0.40
    7:0.05
    8:0.03
    9:0.03
    10:0.04
    11:0.08
    Negative Logits
    "Published"    -1.48
    "ipel"         -1.39
    "Chel"         -1.35
    "850"          -1.34
    "ashington"    -1.32
    "second"       -1.25
    " NX"          -1.23
    "gam"          -1.20
    "875"          -1.20
    "shown"        -1.20
    Positive Logits
    "uddin"        1.52
    "ody"          1.38
    "WER"          1.33
    "ierrez"       1.32
    "ochet"        1.29
    "inent"        1.25
    "udi"          1.22
    "UGH"          1.21
    "tery"         1.20
    "inence"       1.19
    Activation Density: 0.005%

    No Known Activations