    Explanations

    phrases indicating actions or events happening in specific contexts

    Head Attr Weights
    Head   Weight
    0      0.02
    1      0.02
    2      0.11
    3      0.05
    4      0.14
    5      0.03
    6      0.12
    7      0.28
    8      0.03
    9      0.03
    10     0.06
    11     0.06
    Negative Logits
    Token               Logit
    "soDeliveryDate"    -1.59
    "arte"              -1.46
    "aza"               -1.42
    "catentry"          -1.40
    "ests"              -1.40
    "ounge"             -1.37
    " packages"         -1.36
    "glers"             -1.36
    "umper"             -1.35
    "essee"             -1.33
    Positive Logits
    Token               Logit
    " disbelief"        1.83
    " negativity"       1.68
    " incompetence"     1.57
    " insults"          1.46
    " arrogance"        1.46
    " sparks"           1.42
    " disappointment"   1.42
    " irrational"       1.42
    " goof"             1.40
    " halluc"           1.39
    Act Density 0.003%

    No Known Activations