INDEX
    Explanations

    Instances of "next" and related phrases that indicate progression or upcoming steps

    Tokens preceding the word "step" or "episode"

    Negative Logits
    Token           Logit
    " myſelf"       -0.70
    "новништво"     -0.58
    "ſelves"        -0.55
    " abſ"          -0.54
    " isolado"      -0.53
    " ſtand"        -0.53
    " uſed"         -0.52
    " himſelf"      -0.52
    " raiſ"         -0.52
    " Paglinawan"   -0.51
    Positive Logits
    Token           Logit
    " next"         1.40
    "next"          1.31
    "Next"          1.27
    " Next"         1.24
    " NEXT"         1.13
    "NEXT"          1.13
    " nästa"        0.99
    " nächste"      0.98
    " nächsten"     0.95
    "NextPage"      0.95
    Act Density 0.481%
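
    The density figure can be read as the share of token positions on which the feature fires. The sketch below is an illustration only, not this dashboard's implementation: it assumes density means the fraction of activations above a threshold, and the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active. The source page does not state how its figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 5 of which activate the feature.
sample = [0.0] * 995 + [0.3, 1.2, 0.8, 2.1, 0.5]
density = activation_density(sample)
print(f"Act Density {density:.3%}")  # prints "Act Density 0.500%"
```

    Under this reading, a density of 0.481% would mean the feature is active on roughly 5 of every 1,000 tokens.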

    No Known Activations