    Explanations

    end punctuation marks, particularly periods and quotes

    Negative Logits
    Logit    Token
    -0.39
    -0.24    ↵↵
    -0.19    ↵ ↵
    -0.18    &nbsp
    -0.18    ↵	↵
    -0.18    ↵    ↵
    -0.17    .
    -0.16    ses
    -0.16    ↵  ↵
    -0.16    ↵		↵
    Positive Logits
    Logit    Token
    0.25     jpg
    0.21     This
    0.20     The
    0.20     pdf
    0.19     It
    0.19     png
    0.18     They
    0.18     These
    0.18     "↵
    0.17     And
    Act Density 0.122%

    No Known Activations