INDEX
    Explanations

    phrases indicating a sequence or order of actions

    phrases indicating action or intention

    Head Attr Weights
    0:0.11
    1:0.04
    2:0.06
    3:0.08
    4:0.06
    5:0.12
    6:0.07
    7:0.08
    8:0.09
    9:0.07
    10:0.13
    11:0.05
    Negative Logits
    \":             -0.97
    andowski        -0.90
     obstruction    -0.90
     Sidd           -0.88
    spoken          -0.88
    Penn            -0.87
    Pod             -0.86
     genius         -0.85
    #$              -0.84
     keyword        -0.84
    Positive Logits
    livion          1.26
    byss            1.23
    ensis           1.11
    amus            0.99
                    0.98
    utics           0.97
    iris            0.95
    cules           0.95
    utic            0.94
    itia            0.93
    Act Density 0.244%

    No Known Activations