INDEX
    Explanations

    punctuation marks and their variations

    Negative Logits
    Token       Logit
    ardin       -0.18
    unal        -0.17
    ahoma       -0.17
    vide        -0.17
    arden       -0.15
     billig     -0.15
    ikel        -0.15
    ypad        -0.15
    atten       -0.14
    ÙĤÙħ        -0.14
    Positive Logits
    Token       Logit
    897         0.18
    iban        0.15
    deaux       0.15
    onio        0.15
     Ban        0.14
    -toggler    0.14
    çħ§         0.14
    pis         0.14
    å®          0.14
    ç«          0.14
    Activation Density: 0.030%

    No Known Activations