INDEX
    Explanations

    starts of new sections or paragraphs in a text

    instances of the word "Next" indicating a sequence or continuation in a text

    Negative Logits
    Token           Logit
    "lees"          -0.66
    "kay"           -0.64
    "zinski"        -0.62
    "ocker"         -0.61
    "ans"           -0.61
    "arbon"         -0.61
    "acons"         -0.59
    "ondon"         -0.58
    "ogether"       -0.57
    "ITH"           -0.57
    Positive Logits
    Token           Logit
    " Steps"        1.10
    "door"          1.04
    " week"         0.98
    " steps"        0.96
    " month"        0.94
    " Generation"   0.92
    " generation"   0.87
    " Month"        0.86
    " door"         0.85
    " year"         0.85
    Activation Density: 0.028%

    No Known Activations