    Explanations

    phrases indicating a list of items or examples

    phrases that introduce or list examples and key ideas

    Negative Logits
        Token            Logit
        "avery"          -0.89
        "endor"          -0.88
        "etz"            -0.87
        "ulhu"           -0.85
        "iolet"          -0.80
        "uca"            -0.79
        "undle"          -0.78
        "amphetamine"    -0.75
        "byss"           -0.75
        "culosis"        -0.74
    Positive Logits
        Token            Logit
        " examples"      1.45
        " reasons"       1.30
        " highlights"    1.27
        " excerpts"      1.23
        " noteworthy"    1.16
        " notable"       1.15
        " salient"       1.15
        " observations"  1.14
        " ways"          1.12
        " facts"         1.12
    Activation Density: 0.102%

    No Known Activations