INDEX
    Explanations

    repeated characters or symbols, particularly vowels with diacritics

    Negative Logits
    "isci"        -0.18
    "itaire"      -0.15
    " Fiction"    -0.15
    "mez"         -0.15
    "jc"          -0.15
    "ÙIJب"        -0.15
    " Berger"     -0.15
    "gli"         -0.14
    "offs"        -0.14
    "avers"       -0.14

    Positive Logits
    "ldre"        0.21
    "olid"        0.17
    "olian"       0.17
    "gypt"        0.16
    "olist"       0.16
    "rz"          0.15
    "t"           0.15
    "neas"        0.15
    "hn"          0.15
    "onde"        0.15

    Activation Density: 0.006%

    No Known Activations