INDEX
    Explanations

    punctuation, particularly commas and apostrophes

    Negative Logits
    Logit   Token (leading spaces are part of the token)
    -0.66   -
    -0.61   '
    -0.55   [token missing]
    -0.53   2
    -0.53   1
    -0.52    Jack
    -0.52   Jack
    -0.48    of
    -0.47    R
    -0.47   3
    Positive Logits
    Logit   Token (leading spaces are part of the token)
    1.27    .$,
    1.27    !("{}",
    1.25    __',
    1.25    OGND
    1.22    \"",
    1.21    )",
    1.20    >",
    1.19     >=",
    1.18    }",
    1.18    ,",
    Act Density 0.394%
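
    As a rough illustration of how a figure like the activation density above could be computed, the sketch below assumes density means the fraction of token positions on which the feature's activation is above zero. The function name, the threshold parameter, and the sample activations are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations, for illustration only.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.2, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

    Under this reading, a density of 0.394% would correspond to the feature firing on roughly 4 of every 1,000 tokens sampled.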

    No Known Activations