INDEX
    Explanations

    punctuation marks, particularly parentheses and quotation marks

    Negative Logits
    toa       -0.17
    UNT       -0.15
    .asc      -0.14
     Wy       -0.14
    /opt      -0.13
    âce       -0.13
    èĪ        -0.13
    mmo       -0.13
    eid       -0.13
    andler    -0.13
    Positive Logits
    ulla      0.15
    insky     0.14
    ús        0.14
     McCabe   0.14
    and       0.14
    lec       0.14
    mant      0.13
    座        0.13
    ¡         0.13
    пи        0.13
    Act Density 0.057%

    No Known Activations