INDEX
    Explanations

    punctuation marks, specifically periods and commas

    Negative Logits
     purpoſe        -1.00
     myſelf         -0.93
     themſelves     -0.92
     reaſon         -0.92
     pleaſure       -0.91
     ſtate          -0.91
     perfons        -0.90
     himſelf        -0.89
     uſed           -0.89
     itſelf         -0.89

    Positive Logits
    ...              0.94
                     0.88
     ...             0.75
     …               0.74
    ....             0.66
    ……               0.62
    ......           0.59
    “...             0.58
    --               0.58
    CloseOperation   0.58

    Act Density 0.188%

    No Known Activations