INDEX
    Explanations

    instances where the text advises skipping or jumping over certain content

    instructions or suggestions to skip sections of text

    Negative Logits
    amen        -0.77
    lee         -0.73
    orc         -0.72
    lie         -0.69
    lisher      -0.68
    oran        -0.68
    crim        -0.67
    Reviewer    -0.67
    rador       -0.67
    eer         -0.65
    Positive Logits
     altogether   0.87
     ahead        0.77
     breakfast    0.75
     overboard    0.73
     vacations    0.69
     puberty      0.69
    ichi          0.68
     detection    0.67
     bothering    0.67
     meals        0.67
    Act Density 0.039%

    No Known Activations