INDEX
    Explanations

    phrases indicating outcomes or consequences

    Head Attr Weights
    0:0.02
    1:0.03
    2:0.08
    3:0.09
    4:0.02
    5:0.04
    6:0.11
    7:0.08
    8:0.06
    9:0.26
    10:0.07
    11:0.10
    Negative Logits
    Token              Logit
    "Reviewed"         -1.14
    " Attempt"         -1.13
    "itement"          -1.13
    "undo"             -1.03
    "vious"            -1.02
                       -1.02
    "conn"             -0.97
    "upload"           -0.96
    " ourselves"       -0.96
    " Administration"  -0.96
    Positive Logits
    Token              Logit
    "hey"              1.04
    "olor"             1.03
    " peril"           1.00
    " haunt"           0.99
    " stride"          0.97
    " setback"         0.96
    " corro"           0.93
    "spr"              0.92
    "los"              0.91
    "gewater"          0.91
    Activation Density: 0.041%

    No Known Activations