    Explanations

    specific punctuation marks and formatting symbols within the text

    Negative Logits
     Bakan
    -0.15
    ALSE
    -0.15
    alse
    -0.15
    awner
    -0.15
    anz
    -0.15
    еться
    -0.14
    ute
    -0.14
    бе
    -0.14
    itzer
    -0.14
    emez
    -0.14
    Positive Logits
     Gall
    0.15
     ven
    0.15
     Tro
    0.15
    quette
    0.15
     Tub
    0.14
    enties
    0.14
     ret
    0.14
    밀
    0.14
    G
    0.14
     gall
    0.13
    Act Density 0.027%

    No Known Activations