    Explanations

    verbs expressing effort or willingness to take action

    phrases emphasizing the ability or effort to take action

    Negative Logits
    Token          Logit
    " IB"          -0.66
    " UR"          -0.63
    " Politics"    -0.61
    " Cait"        -0.60
    " rejection"   -0.60
    " Sands"       -0.60
    " Passage"     -0.59
    " leaked"      -0.58
    " Irving"      -0.58
    " Winner"      -0.56
    Positive Logits
    Token          Logit
    " muster"      1.02
    "berra"        0.96
    "'t"           0.93
    " feas"        0.89
    " afford"      0.88
    "adian"        0.83
    " emulate"     0.82
    "nesota"       0.80
    "strip"        0.78
    "aido"         0.77
    Activation Density: 0.081%

    No Known Activations