INDEX
    Explanations

    phrases indicating a progression or transition from one state to another

    phrases indicating progressive actions or processes

    Negative Logits
    "ids"          -0.72
    "juven"        -0.69
    " immortal"    -0.65
    "pur"          -0.65
    " blot"        -0.64
    " rug"         -0.63
    " sort"        -0.62
    " stand"       -0.62
    " quake"       -0.61
    " mistaken"    -0.59
    Positive Logits
    "Through"       1.02
    " Through"      0.99
    "edIn"          0.81
    "through"       0.81
    "ategory"       0.77
    "clair"         0.77
    " thru"         0.74
    " Collider"     0.72
    "cape"          0.72
    "ensibly"       0.72
    Activation Density: 0.007%

    No Known Activations