INDEX
    Explanations

    mentions of making progress or taking action

    phrases indicating future actions or decisions

    Negative Logits
     herself
    -0.66
    Downloadha
    -0.65
     accompanies
    -0.64
     denotes
    -0.61
     uttered
    -0.61
    assis
    -0.61
     Sample
    -0.61
    FUL
    -0.60
    ★
    -0.60
    blance
    -0.59
    Positive Logits
     ourselves
    1.46
     gonna
    0.87
     [
    0.78
     everybody
    0.78
    mble
    0.77
    selves
    0.76
     our
    0.72
     guys
    0.71
     together
    0.71
     gotta
    0.70
    Activation Density 0.483%

    No Known Activations