INDEX
    Explanations

    instances of punctuation marks, particularly dashes and ellipses, in the text

    NEGATIVE LOGITS
    acman     -0.16
    åķ        -0.16
    ause      -0.15
    IGHL      -0.15
    ife       -0.15
    SOLE      -0.14
    eed       -0.14
    .dense    -0.14
    aspers    -0.14
    oul       -0.14
    POSITIVE LOGITS
    oret          0.17
    -uppercase    0.15
    sdale         0.14
     -*-č↵        0.13
     hence        0.13
     equals       0.13
    fx            0.13
    ayah          0.13
    wiki          0.13
     both         0.13
    Act Density 0.121%

    No Known Activations