    Explanations

    punctuation marks, specifically parentheses and periods

    Negative Logits
    ings            -0.15
    lett            -0.15
    ons             -0.14
    -ÑĤо            -0.14
    -↵↵             -0.13
     respective     -0.13
    reader          -0.13
    xima            -0.12
    aille           -0.12
    аÐ              -0.12
    Positive Logits
    s               0.36
    Ùĩ              0.24
    y               0.19
    à¸Ħ             0.19
    sian            0.18
    samp            0.18
    i               0.18
    ième            0.18
    sak             0.17
    ÛĮ              0.16
    Activation Density: 0.223%

    No Known Activations