INDEX
    Explanations

    phrases indicating readiness or willingness to take action

    Head Attr Weights
    0:0.01
    1:0.01
    2:0.08
    3:0.07
    4:0.09
    5:0.02
    6:0.09
    7:0.35
    8:0.02
    9:0.02
    10:0.08
    11:0.11
    Negative Logits
    acebook
    -1.61
    itures
    -1.46
     Tycoon
    -1.43
    uthor
    -1.31
    umblr
    -1.30
    inity
    -1.29
     Influence
    -1.28
    vernment
    -1.28
    avery
    -1.26
     Bought
    -1.26
    Positive Logits
     unle
    1.56
     prepared
    1.56
     hardened
    1.55
    upt
    1.51
     ready
    1.51
    ppo
    1.46
     conting
    1.44
     seasoned
    1.39
     preparations
    1.34
     bursting
    1.33
    Act Density 0.013%

    No Known Activations