INDEX
    Explanations

    punctuation marks

    punctuation marks, particularly commas and colons

    Negative Logits
    tarian     -0.75
    backer     -0.74
    atively    -0.74
    ensible    -0.73
    arted      -0.73
    obook      -0.72
    RF         -0.71
    ory        -0.70
    inking     -0.70
    ought      -0.70
    Positive Logits
     qui       1.09
     il        1.05
     si        1.04
     et        1.01
     eh        0.99
     ja        0.98
     tu        0.98
     ni        0.97
     la        0.95
     nun       0.92
    Activation Density 0.048%

    No Known Activations