INDEX
    Explanations

    punctuation marks and special characters in the text

    Negative Logits
     IPS    -0.17
    ut      -0.17
    ie      -0.16
    el      -0.16
    io      -0.15
    ul      -0.15
    IDA     -0.15
    era     -0.15
    ik      -0.15
     CPS    -0.15
    Positive Logits
    LOPT       0.21
    beits      0.21
    VOKE       0.21
    BOVE       0.19
    ceptar     0.19
    sembler    0.19
    INDOW      0.19
    ninger     0.19
    OLVE       0.19
    ARGET      0.19
    Act Density 0.114%

    No Known Activations