INDEX
    Explanations

    punctuation marks and formatting symbols

    Negative Logits
    Token       Logit
    "ueblo"     -0.16
    "nds"       -0.15
    "apur"      -0.15
    "ayne"      -0.15
    "ayın"      -0.14
    ".Emit"     -0.14
    "داد"       -0.14
    "wich"      -0.13
    "ovo"       -0.13
    "ayız"      -0.13

    Positive Logits
    Token       Logit
    "ٳ"         0.16
    "ĩnh"       0.14
    "kke"       0.14
    "stal"      0.14
    "enberg"    0.14
    "iom"       0.14
    "arak"      0.14
    " Berger"   0.14
    "Utf"       0.13
    " Utf"      0.13
    Act Density 0.329%

    No Known Activations