INDEX
    Explanations

    punctuation marks, particularly periods and question marks

    NEGATIVE LOGITS
    "hd"           -0.06
    "ADATA"        -0.06
    "åζ"           -0.06
    "ml"           -0.06
    "yt"           -0.06
    " Michaels"    -0.06
    "insky"        -0.06
    "vard"         -0.06
    "haft"         -0.06
    "berry"        -0.05
    POSITIVE LOGITS
    "plural"       0.08
    " plural"      0.07
    "_SUITE"       0.07
    "sehen"        0.07
    "pson"         0.06
    "aty"          0.06
    " же"          0.06
    " Yeni"        0.06
    "ué"           0.06
    "_patch"       0.06
    Activation Density: 0.136%

    No Known Activations