INDEX
    Explanations

    punctuation marks and formatting indicators in text

    Negative Logits
    Token         Logit
    " presses"    -0.17
    "bane"        -0.15
    "villa"       -0.15
    "åħ¸"         -0.15
    "rut"         -0.14
    " Yer"        -0.14
    " pressed"    -0.14
    "kehr"        -0.14
    "gar"         -0.14
    " Press"      -0.14
    Positive Logits
    Token         Logit
    " Santos"     0.15
    "dej"         0.15
    "addOn"       0.15
    "acock"       0.14
    "QUIT"        0.14
    "дон"         0.14
    " prolong"    0.14
    " lesb"       0.14
    "ìļ´"         0.14
    "είÏĦε"       0.14
    Activation Density: 0.007%

    No Known Activations