INDEX
    Explanations

    punctuation and structural markers in text

    Negative Logits
    Token       Logit
    "iani"      -0.18
    "oft"       -0.16
    "ublik"     -0.14
    "aqu"       -0.14
    "addock"    -0.14
    "icket"     -0.14
    "bia"       -0.14
    "oji"       -0.14
    "ocab"      -0.14
    " leak"     -0.14
    Positive Logits
    Token       Logit
    " -*-č↵"    0.17
    "ilda"      0.16
    "465"       0.16
    "teg"       0.15
    "ê¹Į"       0.15
    "-toggler"  0.15
    "Cha"       0.15
    "ÄĽtÃŃ"     0.15
    " Cha"      0.15
    " cha"      0.14
    Activation Density: 0.004%

    No Known Activations