INDEX
    Explanations

    punctuation marks, particularly periods

    Negative Logits
    adolu       -0.16
     Vern       -0.15
    elyn        -0.15
    amental     -0.14
    olina       -0.14
    ashi        -0.14
    elmet       -0.14
    ìļ±         -0.14
    embedded    -0.13
    ngine       -0.13

    Positive Logits
    aison        0.16
    ice          0.15
    ิว           0.14
    unts         0.14
    phant        0.14
    rax          0.14
    erman        0.14
    cht          0.13
    ryan         0.13
     sesame      0.13
    Act Density 0.009%

    No Known Activations