    Explanations

    punctuation variants, especially periods, at the end of sentences or phrases

    Negative Logits
     undert
    -0.18
     lifetime
    -0.16
    orage
    -0.16
     Rus
    -0.15
    roma
    -0.15
    Lifetime
    -0.14
     Lifetime
    -0.14
    agra
    -0.14
    QR
    -0.14
    ÏīÏĤ
    -0.13
    Positive Logits
    zych
    0.20
     neob
    0.16
    edics
    0.15
    agnostic
    0.15
    رد
    0.14
     Lesser
    0.14
    linky
    0.14
    ValueChanged
    0.14
    itzer
    0.14
    HECK
    0.14
    Activation Density: 0.155%

    No Known Activations