    Explanations

    numerical values and symbols in the text

    Negative Logits
    Token     Logit
    "amas"    -0.85
    "ames"    -0.84
    "ipal"    -0.82
    "ire"     -0.75
    "iph"     -0.75
    "oms"     -0.73
    "ops"     -0.73
    "ilty"    -0.71
    "eree"    -0.71
    "osh"     -0.71
    Positive Logits
    Token              Logit
    " Throughout"      1.03
    " Later"           1.02
    " Eventually"      1.01
    " Nevertheless"    0.99
    " Secondly"        0.99
    " Initially"       0.98
    " Earlier"         0.98
    " Anyway"          0.97
    " Shortly"         0.95
    " Now"             0.95
    Activation Density: 0.154%

    No Known Activations