    Explanations

    phrases related to aggressive actions or criticisms directed towards something or someone

    expressions related to aggressive confrontations and disputes

    Negative Logits
    Token               Logit
    "artifacts"         -0.74
    "OTA"               -0.71
    "ylan"              -0.68
    " Wonders"          -0.68
    " Suc"              -0.66
    "soDeliveryDate"    -0.66
    "hess"              -0.65
    "scope"             -0.65
    " orderly"          -0.63
    "sterdam"           -0.63
    Positive Logits
    Token               Logit
    " accusing"         1.20
    " insults"          1.08
    " slurs"            1.06
    " leveled"          1.03
    " slander"          1.03
    " tir"              1.00
    " denouncing"       0.99
    " criticizing"      0.98
    " accusation"       0.97
    " dispar"           0.97
    Activation Density: 0.249%

    No Known Activations