INDEX
    Explanations

    punctuation, specifically periods at the end of sentences

    Negative Logits
    "YG"        -0.17
    " comm"     -0.16
    "ooks"      -0.15
    " Hatch"    -0.15
    " cur"      -0.14
    ".cur"      -0.14
    "IVE"       -0.14
    "heel"      -0.14
    "ifica"     -0.14
    "ive"       -0.14
    Positive Logits
    "astr"      0.17
    "ewn"       0.15
    "arin"      0.15
    "imson"     0.14
    "olem"      0.14
    "PEND"      0.14
    "мага"      0.14
    "alink"     0.13
    "PLY"       0.13
    "emachine"  0.13
    Activation Density: 0.003%

    No Known Activations