INDEX
    Explanations

    words related to rejection or denial

    Head Attr Weights
    0:0.12
    1:0.03
    2:0.11
    3:0.03
    4:0.06
    5:0.09
    6:0.06
    7:0.07
    8:0.04
    9:0.07
    10:0.18
    11:0.09
    Negative Logits
    ��                 -1.26
    natureconservancy  -1.24
    ghai               -1.21
    angular            -1.20
    assic              -1.15
     Orbit             -1.12
     Milky             -1.11
    ocene              -1.10
     breeze            -1.06
     tidal             -1.06
    Positive Logits
     unsuccessfully  1.35
     unsub           1.25
     repeatedly      1.19
     angrily         1.15
    Examples         1.15
     citing          1.13
     amnesty         1.13
     falsely         1.12
     apologize       1.10
    claim            1.07
    Act Density 0.084%

    No Known Activations