INDEX
    Explanations

    punctuation and formatting cues in the text

    Negative Logits
    ynn       -0.15
    wyn       -0.15
    lien      -0.14
    alon      -0.14
     latter   -0.14
     exclus   -0.14
    .clips    -0.14
    frames    -0.14
    _ISO      -0.14
    infra     -0.13
    Positive Logits
    ause      0.17
    stants    0.15
    etim      0.15
    ëįķ       0.14
    anine     0.14
    اض        0.13
    ÂĿ        0.13
    ì´Ŀ       0.13
     pás      0.13
     ninh     0.13
    Activation Density 0.222%

    No Known Activations