INDEX
    Explanations

    punctuation marks, specifically commas and apostrophes

    Negative Logits
    umas     -0.17
    affen    -0.17
    riad     -0.16
    gili     -0.15
    anoia    -0.15
    ssa      -0.15
    pmat     -0.14
    STALL    -0.14
    arest    -0.14
    rosse    -0.14
    Positive Logits
    ing         0.20
     Grace      0.16
     grace      0.16
    sil         0.15
    mods        0.14
    Grace       0.14
    arial       0.14
    yper        0.14
    _defs       0.14
     silence    0.14
    Act Density 0.007%

    No Known Activations