    Explanations

    phrases related to conflicts or struggles

    Head Attr Weights
    0:0.07
    1:0.01
    2:0.04
    3:0.24
    4:0.02
    5:0.06
    6:0.02
    7:0.04
    8:0.02
    9:0.02
    10:0.38
    11:0.01
    Negative Logits
     respectively:-2.19
    SPONSORED:-2.15
     graduates:-2.08
     contrasted:-2.08
     counterparts:-1.94
     contrasts:-1.91
     backdrop:-1.90
     reclaimed:-1.87
    -1.86
    -1.86
    Positive Logits
     groove:2.87
     ASAP:2.25
     contrace:2.03
     funk:2.02
     enthus:1.94
     smoking:1.88
    Phones:1.86
     trouble:1.85
     Smoking:1.84
     backdoor:1.82
    Act Density 0.043%

    No Known Activations