INDEX
    Explanations

    phrases indicating opposition or resistance to various actions or policies

    Head Attr Weights
    0:0.02
    1:0.05
    2:0.09
    3:0.04
    4:0.02
    5:0.06
    6:0.05
    7:0.05
    8:0.14
    9:0.07
    10:0.08
    11:0.27
    Negative Logits
    lia
    -1.23
     condensed
    -1.18
     optimized
    -1.14
     exqu
    -1.12
    FINEST
    -1.11
    staking
    -1.10
    itone
    -1.10
     manif
    -1.09
     insulated
    -1.06
     poked
    -1.06
    Positive Logits
     anymore
    1.39
    cause
    1.31
     altogether
    1.17
    erous
    1.16
     Citation
    1.16
     whatsoever
    1.16
    comments
    1.16
     injust
    1.15
     harms
    1.15
     hurting
    1.13
    Act Density 0.076%

    No Known Activations