INDEX
    Explanations

    punctuation, particularly periods

    Negative Logits
    Token         Logit
    "eç"          -0.16
    "675"         -0.16
    "odia"        -0.15
    "INST"        -0.15
    "otre"        -0.15
    "ewan"        -0.15
    "esel"        -0.15
    "aoke"        -0.14
    "fter"        -0.14
    "zdy"         -0.14

    Positive Logits
    Token         Logit
    "лим"         0.16
    "Ïĥαν"        0.14
    " BaÄŁ"       0.13
    "ãĥ¡ãĥ©"      0.13
    "leDb"        0.13
    " Alf"        0.13
    " exhaust"    0.13
    "íĻ"          0.13
    "216"         0.13
    "Ïģια"        0.13

    Activation Density: 0.030%

    No Known Activations