    Explanations

    punctuation marks and their frequency in text

    Negative Logits
    oy
    -0.17
    urr
    -0.17
    urga
    -0.15
     Sick
    -0.15
    orer
    -0.15
    ton
    -0.14
    uch
    -0.14
    ęki
    -0.14
    urg
    -0.14
    utton
    -0.14
    Positive Logits
    linger
    0.15
     prere
    0.14
    oř
    0.14
     Rag
    0.14
    227
    0.14
    ronics
    0.14
    iman
    0.14
    echa
    0.14
    )")↵↵
    0.14
    adders
    0.14
    Activation Density: 0.162%

    No Known Activations