INDEX
    Explanations

    punctuation marks and their variations

    Negative Logits
    Token        Logit
    "urdy"       -0.16
    "èĤ¡"        -0.14
    "hdl"        -0.14
    "ilder"      -0.14
    "703"        -0.14
    " geri"      -0.14
    " arma"      -0.14
    " Clair"     -0.13
    "249"        -0.13
    "äft"        -0.13
    Positive Logits
    Token        Logit
    " Tos"       0.17
    "izo"        0.15
    " Barbar"    0.14
    " (_,"       0.14
    " Marg"      0.14
    ".dt"        0.14
    " Zo"        0.13
    " Dere"      0.13
    " Tween"     0.13
    " civ"       0.13
    Activation Density: 0.012%

    No Known Activations