INDEX
    Explanations

    negative constructs or phrases related to scrutiny and lack of accountability in various contexts

    Head Attr Weights
    0:0.06
    1:0.01
    2:0.19
    3:0.20
    4:0.12
    5:0.03
    6:0.06
    7:0.03
    8:0.10
    9:0.03
    10:0.05
    11:0.07
    Negative Logits
    " anonym"            -1.46
    " icing"             -1.43
    "ynchron"            -1.39
    " backlog"           -1.38
    "tumblr"             -1.38
    " mash"              -1.31
    " imagining"         -1.29
    " slur"              -1.27
    " vaguely"           -1.27
    " ranging"           -1.25
    Positive Logits
    " nor"               3.54
    "nor"                2.35
    " anymore"           2.20
    "inventoryQuantity"  1.97
    "irlf"               1.81
    "Nor"                1.75
    " Nor"               1.74
    "omsky"              1.65
    "ught"               1.64
    "ridges"             1.56
    Act Density 0.019%

    No Known Activations