    Explanations

    phrases related to actions or intentions involving going, doing, or saying

    expressions of intention or desire related to actions and activities

    Negative Logits
    Token               Logit
    "requires"          -0.75
    " Increasing"       -0.69
    "marked"            -0.68
    " unsurprisingly"   -0.67
    " flagged"          -0.66
    "surprisingly"      -0.65
    " millenn"          -0.65
    " Advantage"        -0.65
    " strikingly"       -0.64
    " noteworthy"       -0.63
    Positive Logits
    Token               Logit
    " stay"             1.32
    " get"              1.18
    " participate"      1.18
    " finish"           1.13
    " talk"             1.12
    " listen"           1.10
    " survive"          1.08
    " hear"             1.07
    " marry"            1.06
    " communicate"      1.06
    Activation Density: 0.414%

    No Known Activations