INDEX
    Explanations

    instances of punctuation or numeric symbols in the text

    Negative Logits
    lesc        -0.17
    ustos       -0.15
    enaire      -0.14
    ioni        -0.14
    Ù쨳         -0.14
    riel        -0.14
     apt        -0.14
    ootball     -0.14
    evt         -0.14
    à¥Įन        -0.14
    Positive Logits
    iyat        0.15
    urname      0.15
    .struts     0.14
    entic       0.14
    (LogLevel   0.14
    olly        0.14
    anka        0.14
    ank         0.14
    MUX         0.14
    лÑĸн        0.14
    Activation Density: 0.003%

    No Known Activations