INDEX
    Explanations

    punctuation and specific phrases or segments that indicate decision-making or significant moments of change

    Negative Logits
    Token             Logit
    " represented"    -0.19
    " meant"          -0.18
    "represented"     -0.16
    " portrayed"      -0.16
    " depicted"       -0.15
    " preceded"       -0.15
    " seen"           -0.14
    " presented"      -0.14
    " supported"      -0.14
    " Seen"           -0.14
    Positive Logits
    Token             Logit
    " ate"            0.29
    " took"           0.26
    " went"           0.26
    " drank"          0.26
    " threw"          0.25
    " got"            0.25
    " blew"           0.24
    "went"            0.24
    " flew"           0.24
    " withdrew"       0.24
    Activation Density: 0.201%

    No Known Activations