INDEX
    Explanations

    punctuation marks and their frequency in the text

    Negative Logits
    1     -0.21
    [     -0.21
    x     -0.20
    2     -0.20
    4     -0.20
    3     -0.20
    10    -0.20
    50    -0.20
    .     -0.19
    5     -0.19
    Positive Logits
     J    0.35
     L    0.34
     M    0.34
     C    0.33
     E    0.33
     R    0.32
     S    0.32
     D    0.32
     G    0.32
     F    0.32
    Act Density 0.020%

    No Known Activations